Abstract 

The growing need for labeled training data has made crowdsourcing an important part of machine 
learning. The quality of crowdsourced labels is, however, adversely affected by three factors: (1) the 
workers are not experts; (2) the incentives of the workers are not aligned with those of the requesters; and 
(3) the interface does not allow workers to convey their knowledge accurately, by forcing them to make 
a single choice among a set of options. In this paper, we address these issues by introducing approval 
voting to utilize the expertise of workers who have partial knowledge of the true answer, and coupling it 
with a (“strictly proper”) incentive-compatible compensation mechanism. We show rigorous theoretical 
guarantees of optimality of our mechanism together with a simple axiomatic characterization. We also 
conduct preliminary empirical studies on Amazon Mechanical Turk which validate our approach. 


1 Introduction 


In the big data era, with the ever increasing complexity of machine learning models such as deep learning, 
the demand for large amounts of labeled data is growing at an unprecedented scale. A primary means of 
label collection is crowdsourcing, through commercial web services like Amazon Mechanical Turk where 
crowdsourcing workers or annotators perform tasks in exchange for monetary payments. Unfortunately, 
the data obtained via crowdsourcing is typically highly erroneous (Kazai et al. 2011; Vuurens et al. 2011; 
Wais et al. 2010) due to the lack of expertise of workers, lack of appropriate incentives, and often the 
lack of an appropriate interface for the workers to express their knowledge. Several statistical aggregation 
methods (Dawid and Skene 1979; Whitehill et al. 2009; Raykar et al. 2010; Karger et al. 2011; Liu et al. 
2012; Zhou et al. 2012; Shah et al. 2015) have been proposed in the literature for improving the quality of 
the data. Our approach complements these techniques in that we endeavor to obtain higher-quality labels 
directly via novel interface and incentive mechanisms while not increasing the labeling cost. 

Figure 1: Illustration of a task with (a) the standard single selection interface, and (b) an approval-voting 
interface. 

The typical crowdsourcing labeling task consists of a set of questions such as images to be labeled, 
and each question is associated with a set of options. Each option is the name of a category and the true 
label for any question is one of these options. In principle, for each question, the worker is required to 
select the option that she believes is most likely to be correct. More formally, it involves eliciting the mode 
of the worker’s belief. Such a “single-selection” crowdsourcing setting has been studied extensively, both 
empirically and theoretically. 

In this paper, we consider an alternative “approval-voting” means of eliciting labels from the workers, 
wherein the worker is allowed to select multiple options for every question.* See Figure 1 for an example. 
Approval voting is known to have many advantages over single-selection systems in psychology and social 
choice theory (Horst 1932; Coombs 1953; Coombs et al. 1956; Collet 1971; Brams and Fishburn 1978; 
Gibbons et al. 1979): it provides workers more flexibility to express their beliefs, and utilizes the expertise 
of workers with partial knowledge more effectively. For instance, Coombs (1953) posits that “It seems to 
be a common experience of individuals taking objective tests to feel confident about eliminating some of 
the wrong alternatives and then guess from among the remaining ones” and that “Individuals taking the test 
should be instructed to cross out all the alternatives which they consider wrong.” Under this approval-voting 
interface, we will require a worker to select every option which she believes could possibly be correct. 
Mathematically, we formulate this problem as eliciting the support of the beliefs of workers for each question. 
In the setting of crowdsourcing, as compared to single-selection, selecting multiple options would allow for 
obtaining more information about the partial knowledge of these non-expert workers. This additional 
information is particularly valuable for difficult labeling questions, allowing for the identification of the sources 
of difficulty. Indeed, Coombs et al. (1956) conclude that under such a questionnaire, “clear evidence for the 
existence of partial information mediating responses to multiple choice items was obtained.” 

Let us illustrate the utility of approval voting using an example in Figure 1. Assume that there are two 
workers. The first worker believes the true label to be either “cheetah” or “leopard”, but certainly not any 
other option; the second worker is confused about some other aspect of the image, and believes the true 
label to be either “cheetah” or “jaguar”, but certainly none of the others. If each worker is allowed to select 
only a single answer, it may turn out that the first worker selects “leopard” and the second worker selects 
“jaguar”. Their responses will thus not provide any definitive answer about the true label. In contrast, if we 
fully elicit their knowledge by letting them select multiple options, that is, (“cheetah”, “leopard”) from the 
first worker and (“cheetah”, “jaguar”) from the other worker, then “cheetah” becomes a clear winner. 
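The two-worker example above can be replayed with a few lines of Python. This is only an illustrative sketch: the paper does not prescribe an aggregation rule at this point, so the simple approval count used here (the hypothetical helper `tally_approvals`) is an assumption made for the example.

```python
from collections import Counter


def tally_approvals(responses):
    """Count how many workers approved each option.

    `responses` is a list with one collection of selected options per worker.
    Plain counting is an assumption of this sketch, not the paper's method.
    """
    counts = Counter()
    for selected in responses:
        counts.update(set(selected))  # each worker counts once per option
    return counts


# Single selection: each worker reports only one option.
single = [["leopard"], ["jaguar"]]
# Approval voting: each worker reports every option she considers possible.
approval = [["cheetah", "leopard"], ["cheetah", "jaguar"]]

print(tally_approvals(single).most_common())
print(tally_approvals(approval).most_common())
```

Under single selection the two options are tied with one vote each, whereas under approval voting “cheetah” is the only option approved by both workers.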

Despite its great flexibility in eliciting partial knowledge, approval voting alone is not sufficient for high 
quality crowdsourcing. A worker may have no incentive to truthfully disclose her partial knowledge on the 
crowdsourcing question. For instance, the worker may simply choose all provided options as her answer 
and get paid. To address this problem, we need to couple approval voting with an appropriate 
“incentive-compatible” payment mechanism such that a worker receives her maximum expected payment if and only if 
she truthfully discloses her partial knowledge (that is, the support of her belief) on the crowdsourcing 
question. In other words, the payment mechanism has to be a “strictly proper scoring rule”. Moreover, we want 
the mechanism to be “frugal”, paying as little as possible to a worker who simply selects all provided options 
as her answer. The problem setting for incentive mechanism design is formally described in Section 2. 

Our first result is negative, proving that unfortunately no mechanism can be incentive compatible for this 
setting (Section 3). This impossibility result leads us to introduce a “coarse belief” assumption that relies 
on a certain granularity in people’s beliefs. 

Our next result is the design of a payment mechanism and associated proofs showing that our mechanism 
is incentive compatible and frugal (Section 4). Furthermore, we show that it is the only mechanism which 
satisfies these two requirements. 

*The literature on psychology often refers to approval voting as “subset selection”. 

We then generalize the analysis of our mechanism to settings where the coarse belief assumption may not 
be satisfied, and show that our mechanism simply incentivizes workers to select options for which their belief 
is relatively high enough (Section 5). This perspective also leads to a simple axiomatic characterization of 
our mechanism. 

We then report results from preliminary experiments verifying certain basic hypotheses underlying our 
approach (Section 6). The paper then turns to two closely related settings: that of general 
utility functions, and that of a problem of reporting only high enough beliefs. The paper concludes 
with a discussion. 


Related literature 


Approval voting (Ottewell 1977; Kellett and Mott 1977; Weber 1977; Brams and Fishburn 1978) is a form 
of voting in which each voter can “approve of” (that is, select) multiple candidates. No further preference 
among these candidates is specified by the voter. Our proposed interface for crowdsourcing elicits approvals 
on the candidate options for each question. Closer to our setting of crowdsourcing, approval voting has been 
studied in the context of question and answer forums (Jain et al. 2009) and Doodle polls (Zou et al. 2014). 
The focus of the present paper is on the design of incentive mechanisms with properties that fundamentally 
hold irrespective of the nature of the setting. 

The framework of scoring rules (Brier 1950; Savage 1971; Gneiting and Raftery 2007; Lambert and 
Shoham 2009) considers the design of payment mechanisms to elicit predictions about an event whose 
actual outcome will be observed in the future. The payment is a function of the agent’s response and the 
outcome of the event. The payment is called “strictly proper” if its expectation, with respect to the belief of 
the agent about the event, is strictly maximized when the agent reports her true belief. Proper scoring rules, 
however, form a very broad class of mechanisms, and do not specify any particular mechanism for use. 
The mechanism proposed in the present paper may alternatively be viewed as the “optimal” proper scoring 
rule for eliciting supports of workers’ beliefs across multiple questions. 

Shah and Zhou (2015) consider a crowdsourcing setup with the traditional single-selection setting, also 
eliciting the workers’ confidences for each response. They propose a mechanism to suitably incentivize 
workers and show that their proposed mechanism is the only one satisfying a proposed 
“no-free-lunch” axiom. While the setting of our work is different from that of Shah and Zhou (2015), interestingly, 
our mechanism, which was derived for a different interface and under a different set of assumptions, turns out 
to be the only mechanism that can satisfy the no-free-lunch axiom (adapted to our setting). 

The mechanisms presented subsequently in the present paper assume the presence of some “gold 
standard” questions whose answers are known a priori to the system designer. There is a parallel line of 
literature (Prelec 2004; Miller et al. 2005; Faltings et al. 2014; Dasgupta and Ghosh 2013) 
that explores the design of mechanisms that operate in the absence of any gold standard questions. These 
works typically elicit additional information from the workers, such as asking them to predict the responses 
of other workers. The mechanisms designed therein can generally provide only weaker guarantees due to 
the absence of a gold standard answer to compare with. 


2 Problem setup 

Consider $N \geq 1$ questions, each of which has $B \geq 2$ options to choose from. For each question, exactly 
one of the $B$ options is correct. We assume that these $N$ questions contain $G$ ($1 \leq G \leq N$) “gold 
standard” questions, that is, questions to which the mechanism designer knows the answers a priori. These 
gold standard questions are assumed to be mixed uniformly at random among the $N$ questions, and the 
worker is evaluated based on her performance on these $G$ questions. For every individual question, we 
assume that the worker has, in her mind, a distribution over the $B$ options representing her beliefs of the 
probabilities of the respective options being correct. We assume that these belief-distributions of a worker 
are independent across questions (Gibbons et al. 1979). For any integer $K$, we will use the standard notation 
of $[K]$ as a shorthand for the set $\{1, \ldots, K\}$. 

Our goal is to elicit, for every question, the support of the worker’s distribution over the $B$ options. In 
other words, we wish to incentivize the worker such that for each question, the worker should select the 
smallest subset of the set of options such that the correct answer according to her belief lies in the selected 
subset. Formally, suppose that for any question $i \in [N]$, the worker believes that the probability of option 
$b \in [B]$ being correct is $p_{ib}$, for some non-negative values $p_{i1}, \ldots, p_{iB}$ that sum to one. Then the goal is to 
incentivize the worker to, for each question $i \in [N]$, select precisely the set of options 

$$\{b \in [B] \mid p_{ib} \neq 0\}. \qquad (1)$$


Payment function. As mentioned earlier, the worker’s performance is evaluated based on her responses 
to the gold standard questions. For any question in the gold standard, we denote the evaluation of the 
worker’s performance on this question by a value in the set $\{-(B-1), \ldots, -1, 1, \ldots, B\}$: the magnitude 
of this value represents the number of options she had selected and the sign is positive if the correct answer 
was in that subset and negative otherwise. For instance, if the worker selected four options for a certain gold 
standard question but none of them was correct, then the evaluation of this response is denoted as “$-4$”; 
if the worker selects two options for a gold standard question and one of them turns out to be the correct 
option then the evaluation of this response is denoted as “$+2$”. 
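The signed encoding described above is easy to compute. The following minimal sketch assumes that a response is given as a collection of selected options and that the correct option is known; the function name `evaluate_response` and its arguments are hypothetical and introduced only for illustration.

```python
def evaluate_response(selected, correct, num_options):
    """Return the signed evaluation of one gold standard response.

    The magnitude is the number of selected options; the sign is positive
    if the correct option is among them and negative otherwise.
    """
    selected = set(selected)
    if not 1 <= len(selected) <= num_options:
        raise ValueError("a worker must select between 1 and B options")
    size = len(selected)
    return size if correct in selected else -size


# Four options selected, none correct: evaluation -4.
print(evaluate_response({"a", "b", "c", "d"}, "e", num_options=6))
# Two options selected, one of them correct: evaluation +2.
print(evaluate_response({"a", "b"}, "a", num_options=6))
```

Note that a worker who selects all $B$ options of a valid question always receives the evaluation $+B$, which is why $-B$ does not appear in the set of possible evaluations.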

We will assume that the payments are bounded, that is, any payment must lie in the interval $[\alpha_{\min}, \alpha_{\max}]$, 
for some values $\alpha_{\min}$ and $\alpha_{\max} > \alpha_{\min}$. The choice of the two parameters $\alpha_{\min}$ and $\alpha_{\max}$ may be made 
keeping various factors in mind, such as guidelines of the crowdsourcing platform used, the budget 
constraints, and the minimum wage. We will assume that the values of the two parameters are given to us. 

Let 

$$f : \{-(B-1), \ldots, -1, 1, \ldots, B\}^G \to [\alpha_{\min}, \alpha_{\max}]$$

denote the payment function. It is this function $f$ which must be designed in order to incentivize the worker. 
We will require that a worker who answers everything perfectly be paid an amount $\alpha_{\max}$, that is, 

$$f(1, \ldots, 1) = \alpha_{\max}. \qquad (2)$$


Expected payment. A quantity central to our analysis is the expected payment, where the expectation is 
from the point of view of the worker, and is taken over the randomness in the choice of the G gold standard 
questions among the N questions, and over the N probability distributions representing her beliefs for the 
$N$ questions. Let us formalize this notion. Suppose that for question $i \in [N]$, the worker has selected some 
$y_i \in [B]$ of the $B$ options. Further, let $s_i \in [0,1]$ denote the probability, under the worker’s beliefs, that 
the correct answer to question $i$ lies in this set of $y_i$ selected options. In other words, $s_i$ denotes the sum of 
the beliefs for the $y_i$ options selected by the worker (consequently, the sum of the beliefs for the options not 
selected is $(1 - s_i)$). Then from the worker’s point of view, her expected payment for this selection is 

$$\frac{1}{\binom{N}{G}} \sum_{(j_1, \ldots, j_G) \subseteq [N]} \; \sum_{(\epsilon_1, \ldots, \epsilon_G) \in \{-1,1\}^G} f(\epsilon_1 y_{j_1}, \ldots, \epsilon_G y_{j_G}) \prod_{i=1}^{G} s_{j_i}^{\frac{1+\epsilon_i}{2}} (1 - s_{j_i})^{\frac{1-\epsilon_i}{2}}. \qquad (3)$$

The outer summation in (3) corresponds to the expectation with respect to the random distribution of 
the $G$ gold standard questions in the $N$ total questions, and the inner summation corresponds to the 
expectation with respect to the worker’s beliefs of her choices being correct. In this paper, we assume that the 
workers aim to maximize their expected rewards; extending our theory to more general utility functions is 
straightforward. 
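The two nested expectations described above can be evaluated by direct enumeration for small $N$ and $G$. The sketch below is an illustration under stated assumptions: the function name `expected_payment`, the representation of the payment function as a Python callable on tuples, and the toy all-or-nothing payment in the usage lines are all hypothetical and not taken from the paper.

```python
from itertools import combinations, product
from math import comb


def expected_payment(f, y, s, G):
    """Worker's expected payment, by brute-force enumeration.

    f : payment function taking a tuple of G signed evaluations
    y : y[i] is the number of options selected for question i
    s : s[i] is the worker's belief that her selection for question i
        contains the correct answer
    G : number of gold standard questions hidden among the len(y) questions
    """
    N = len(y)
    total = 0.0
    # Outer expectation: uniformly random choice of the gold standard set.
    for gold in combinations(range(N), G):
        # Inner expectation: each selection is right (+1) or wrong (-1).
        for signs in product((1, -1), repeat=G):
            prob = 1.0
            for j, e in zip(gold, signs):
                prob *= s[j] if e == 1 else 1.0 - s[j]
            if prob == 0.0:
                continue  # impossible outcome; also keeps f inside its domain
            x = tuple(e * y[j] for j, e in zip(gold, signs))
            total += prob * f(x)
    return total / comb(N, G)


# Toy payment (an assumption for illustration): 1 if every evaluation is
# positive, 0 otherwise.
toy = lambda x: 1.0 if all(v > 0 for v in x) else 0.0
print(expected_payment(toy, y=[1, 2], s=[0.5, 1.0], G=1))
```

In the usage lines, the gold standard question is question 1 or question 2 with equal probability, so the expected payment is the average of 0.5 and 1.0.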

Since the performance of any worker is evaluated only on her responses to the gold standard questions, 
whose answers are already known to the mechanism designer, the payments made 
to different workers do not depend on each other. Hence we consider only one worker without loss of 
generality. 

Goal. The goal is to design mechanisms that are incentive compatible: 

Definition 1 (Incentive compatibility). A mechanism is incentive compatible if the expected payment 
(Equation (3)), from the worker’s point of view, is strictly maximized when she selects precisely the support 
(Equation (1)) of her belief for each question. 

Note that the definition of incentive compatibility used here considers a “strict” maximization. 

Observe that a worker who selects all the options for all the questions does not give any useful 
information. In order to deter such “freeloading” behavior, one would like to ensure that in addition to paying a 
(large enough) amount $\alpha_{\max}$ to a good worker, the mechanism should expend as small an amount as possible 
on such a worker. This leads to a notion of “frugality”. 

Definition 2 (Frugality). An incentive-compatible mechanism $f$ is frugal if 

$$f(B, \ldots, B) \leq f'(B, \ldots, B)$$

for every incentive-compatible mechanism $f'$ that has $f'(1, \ldots, 1) = f(1, \ldots, 1)$. 

Our goal is to design mechanisms that are incentive-compatible, and whenever they exist, find the 
mechanism(s) that is (are) most frugal. 


3 An impossibility result and a coarse-beliefs assumption 

It turns out that, unfortunately, we must face a roadblock in the first step: We can show that there exists no 
mechanism that is incentive compatible. 

Theorem 3.1. For any $N$, $G$ and $B \geq 2$, there is no mechanism that can guarantee that the worker will be 
incentivized to select precisely the support of her distribution for each question. 

The proof of this result and other theoretical results (except Theorem 4.1) are provided in the appendix. 
In order to circumvent this impossibility result, we appeal to a certain well-understood property of 
human belief. 


Coarse beliefs assumption 


There is an extensive literature in psychology establishing the coarseness of processing and perception in 
humans. For instance, Miller’s celebrated paper (Miller 1956) establishes the information and storage 
capacity of humans, that an average human being can typically distinguish at most about seven states (see also 
Saaty and Ozdemir 2003). Jones and Loe (2013) establish the ineffectiveness of finer-granularity response 
elicitation. Mullainathan et al. (2008) hypothesize that humans often group things into categories; this 
hypothesis is experimentally verified by Siddiqi (2011) in a specific setting. We incorporate this established 
notion of coarseness of human processing in our model in terms of a simple assumption. 

Consider some (fixed and known) value $p > 0$, and assume that the probability of any option for any 
question, according to the worker’s belief, is either zero or greater than $p$. The impossibility shown in 
Theorem 3.1 pertains to $p = 0$. Also, one must necessarily take into account situations when a worker is 
totally clueless about a question, that is, when her belief is distributed uniformly over all options, so that 
every option has belief $1/B$. Hence we restrict $p < 1/B$. To summarize, we make the following “coarse belief” assumption. 

Definition 3 (Coarse belief assumption). The worker’s belief for any option for any question lies in the set 
$\{0\} \cup (p, 1]$ for some (fixed and known) $p \in (0, 1/B)$. 

We wish to elicit the full support of the workers’ beliefs, given a coarseness of belief that assigns a value 
of zero to very low probability categories. The goal is to design mechanisms that are incentive-compatible 
and frugal, assuming the coarse belief assumption holds true. 

4 Incentive mechanism 

Mechanism 1 presents our proposed mechanism for the problem at hand, under the coarse belief assumption. 


Mechanism 1 Incentive mechanism for approval voting 

• Input: Evaluations of the worker’s answers to the $G$ gold standard questions 

$$(x_1, \ldots, x_G) \in \{-(B-1), \ldots, -1, 1, \ldots, B\}^G$$

• Output: The worker’s payment 

$$f(x_1, \ldots, x_G) = (\alpha_{\max} - \alpha_{\min}) \prod_{i=1}^{G} (1-p)^{x_i - 1} \, \mathbf{1}\{x_i \geq 1\} + \alpha_{\min}$$

The payment is based only on the evaluation of the worker’s responses to the gold standard questions. It 
is easy to describe the mechanism in words: The payment is $\alpha_{\min}$ plus 

• 0 if, for any of the gold standard questions, the correct answer is not among the selected options, otherwise 

• $(\alpha_{\max} - \alpha_{\min})$ reduced by $(100p)\%$ for each incorrect option selected. 
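The verbal description above translates directly into code. The following is a minimal sketch, not the authors' implementation: the function name `mechanism_payment` and its argument names are hypothetical, and the evaluations are assumed to use the signed encoding of Section 2.

```python
def mechanism_payment(evaluations, p, alpha_min, alpha_max):
    """Payment for signed evaluations of the G gold standard questions.

    evaluations : iterable of signed evaluations in {-(B-1),...,-1,1,...,B}
    p           : coarseness parameter of the belief assumption
    """
    bonus = alpha_max - alpha_min
    for x in evaluations:
        if x < 1:
            # The correct answer was missed on a gold standard question.
            return alpha_min
        # x - 1 incorrect options were selected alongside the correct one;
        # each of them reduces the bonus by a factor (1 - p).
        bonus *= (1.0 - p) ** (x - 1)
    return alpha_min + bonus


# Perfect answers earn the maximum payment.
print(mechanism_payment([1, 1, 1], p=0.1, alpha_min=10, alpha_max=50))
# One missed gold standard question wipes out the bonus.
print(mechanism_payment([2, -1], p=0.1, alpha_min=10, alpha_max=50))
```

The multiplicative form means that a single wrong answer on a gold standard question removes the entire bonus, while each extra option selected alongside the correct one only shrinks it.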

The following pair of theorems present our main results, proving that this mechanism is the one and 
only mechanism that satisfies our requirements. 

Theorem 4.1. Under the coarse-beliefs assumption, Mechanism 1 is incentive-compatible and frugal. 

The following theorem shows that our mechanism is strictly better than any other mechanism. 

Theorem 4.2. Under the coarse-beliefs assumption, there is no other incentive-compatible mechanism that 
expends as small an amount as Mechanism 1 on a worker who does not attempt any question. 

To show the optimality and uniqueness properties claimed in Theorem 4.1 and Theorem 4.2 respectively, 
we prove the absence of other good mechanisms via contradiction-based arguments. Specifically, for any 
candidate mechanism, we identify a set of beliefs for which the worker will not be incentivized to act as 
required. In line with our earlier argument of beliefs being “coarse”, the beliefs considered in these proofs 
are simple enough: the worker has some belief about one of the options, knows for sure that certain other 
options are incorrect, and is indifferent among the rest of the options. 

To put things in perspective, observe that $p = 0$ eliminates the dependence of the payment in 
Mechanism 1 on the magnitudes $|x_i|$ and makes the mechanism incentive incompatible. The impossibility result of Theorem 3.1 
proves that every possible mechanism must necessarily suffer this fate. 

The remainder of this section is devoted to the proof of Theorem 4.1. The reader may feel free to jump 
to Section 5 without any loss in continuity. 


Proof of Theorem 4.1 

Without loss of generality, assume that $\alpha_{\min} = 0$ since in our setting, the property of incentive compatibility 
is invariant to any constant shift and positive scale of the payment. We adopt the succinct notation of 

$$\alpha := \alpha_{\max} - \alpha_{\min}.$$

Incentive compatibility. First consider the case $N = G = 1$. In this case, Mechanism 1 reduces to 

$$f(x) = \alpha (1-p)^{x-1} \, \mathbf{1}\{x \geq 1\}.$$

Suppose without loss of generality that the worker’s beliefs for the $B$ options are $p_1 \geq \cdots \geq p_m > p > 
p_{m+1} = \cdots = p_B = 0$ for some $m \in [B]$. An incentive-compatible mechanism must strictly maximize the 
worker’s expected payment when she selects the support of her belief, that is, the options $\{1, \ldots, m\}$. The 
expected payment, $\Gamma_{\mathrm{sup}}$, under this selection is 

$$\Gamma_{\mathrm{sup}} = \alpha (1-p)^{m-1} \sum_{i=1}^{m} p_i = \alpha (1-p)^{m-1}.$$

Suppose the worker selects some other set of options $\{o_1, \ldots, o_\ell\} \subseteq [B]$, $\{o_1, \ldots, o_\ell\} \neq [m]$. Then her 
expected payment under the proposed mechanism for this selection is 

$$\Gamma_{\mathrm{oth}} = \alpha (1-p)^{\ell-1} \sum_{i=1}^{\ell} p_{o_i} \leq \alpha (1-p)^{\ell-1} \sum_{i=1}^{\ell} p_i, \qquad (4)$$

since $p_1 \geq \cdots \geq p_B$. If $\ell = m$ then the inequality in (4) is strict since $p_j < p_i$ for all ($j > m$, $i \leq m$). Thus 
the expected payment under the choice $\ell = m$ but with a selection different from the support is strictly lower 
than $\Gamma_{\mathrm{sup}}$. Also observe that the expected payment on selecting $\ell > m$ is upper bounded by $\alpha (1-p)^{\ell-1}$, 
which is strictly smaller than $\Gamma_{\mathrm{sup}}$. Let us now consider the remaining, interesting case of $\ell < m$. Since 
$p_i > p$ for all $i \in [m]$, we have 

$$\begin{aligned}
\Gamma_{\mathrm{oth}} &< \alpha \Big( \sum_{i=1}^{m} p_i - (m - \ell) p \Big) (1-p)^{\ell-1} \\
&= \alpha \big( 1 - (m - \ell) p \big) (1-p)^{\ell-1} \\
&\leq \alpha \big( 1 - (m - (\ell+1)) p \big) (1-p)^{\ell} \\
&\;\;\vdots \\
&\leq \alpha (1-p)^{m-1} \\
&= \Gamma_{\mathrm{sup}}.
\end{aligned}$$

This completes the proof for the case $N = G = 1$. 

Let us now consider the case of $N = G > 1$. By our assumption of the independence of the beliefs of 
the worker across the questions, the expected payment equals 

$$\alpha \prod_{i=1}^{G} s_i (1-p)^{y_i - 1}.$$

Since the payments are non-negative, if each individual component in the product is maximized then the 
product is also necessarily maximized. Each individual component simply corresponds to the setting of 
$N = G = 1$ discussed earlier. Thus calling upon our earlier result, we get that the expected payment for the 
case $N = G > 1$ is maximized when the worker acts as desired for every question. 

Let us finally consider the case of $N > G \geq 1$. Recall from (3) that the expected payment for the 
general case is a cascade of two expectations: the outer expectation is with respect to the uniformly random 
distribution of the $G$ gold standard questions among the $N$ total questions, while the inner expectation 
is taken over the worker’s beliefs of the different questions conditioned on the choice of the gold standard 
questions. The arguments above for the case $N = G$ prove that every individual term in the inner expectation 
is maximized when the worker acts as desired. The expected payment is thus maximized when the worker 
acts as desired. 

Frugality. We first present a lemma that forms the workhorse of this and other subsequent proofs. 

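Lemma 4.3. Consider some $y, y' \in [B]^N$ and some $I \subseteq [N]$ such that $y_i = y'_i + 1$ for all $i \in I$, and $y_i = y'_i$ 
for all $i \notin I$. Then any incentive compatible mechanism $f$ must necessarily satisfy 

$$\sum_{(j_1, \ldots, j_G) \subseteq [N]} f(y_{j_1}, \ldots, y_{j_G}) \;\geq\; \sum_{(j_1, \ldots, j_G) \subseteq [N]} (1-p)^{|\{i \,:\, j_i \in I\}|} f(y'_{j_1}, \ldots, y'_{j_G}).$$

Furthermore, a necessary condition for the above equation to be satisfied with equality is 

$$f(\epsilon_1 y'_{j_1}, \ldots, \epsilon_G y'_{j_G}) = 0$$

for all $(j_1, \ldots, j_G) \subseteq [N]$, and all $\{(\epsilon_1, \ldots, \epsilon_G) \in \{-1,1\}^G \setminus \{1\}^G \mid \epsilon_i = 1$ whenever $j_i \notin I\}$. 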


The proof of the lemma is provided in the appendix. We now prove the frugality of our proposed 
mechanism using this lemma. Consider any incentive compatible mechanism $f$ such that $f(1, \ldots, 1) = \alpha$. 
Consider any $x_0 \in [B-1]$. Applying Lemma 4.3 with $y = (x_0 + 1, \ldots, x_0 + 1)$, $y' = (x_0, \ldots, x_0)$ and 
$I = [N]$ gives 

$$f(x_0 + 1, \ldots, x_0 + 1) \geq (1-p)^G f(x_0, \ldots, x_0).$$

A repeated application of this inequality for all $x_0 \in [B-1]$ gives 

$$f(B, \ldots, B) \geq (1-p)^{G(B-1)} f(1, \ldots, 1) = \alpha (1-p)^{G(B-1)}.$$

Mechanism 1 achieves this lower bound on $f(B, \ldots, B)$ with equality, thereby completing the proof. 


5 Robustness to the coarse beliefs assumption 

We earlier made the “coarse belief” assumption that the worker’s belief for any option, when non-zero, 
exceeds p. We then designed Mechanism 1, which is incentive compatible with respect to eliciting the 
supports of the beliefs of the worker. A natural question that then arises is: How does the mechanism perform if 
the coarse beliefs assumption is violated? Does the mechanism break down? 

In this section, we generalize the results presented earlier in the paper to the setting where workers may 
have arbitrary beliefs. It turns out that our proposed mechanism continues to incentivize workers to act in a 
certain desirable way. 

5.1 Incentivizing workers with finer beliefs 

Suppose that Mechanism 1 (for a certain value of p) is encountered by a worker who may have arbitrary 
beliefs. Interestingly, it turns out that the mechanism doesn’t break down, but instead does something 
desirable: it incentivizes the worker to select all options for which the relative belief of the worker is high 
enough. 

Theorem 5.1. Under Mechanism 1, for any question, a worker with beliefs $1 \geq p_1 \geq \cdots \geq p_B \geq 0$ will be 
incentivized to select options $\{1, \ldots, m\}$ where 

$$m = \max\Big\{ z \in [B] \;:\; \frac{p_z}{\sum_{i=1}^{z} p_i} > p \Big\}.$$


It is not hard to interpret this incentivized action. The worker selects options one by one in decreasing 
order of her beliefs as long as the selected option contributes a fraction more than p to the total belief of the 
selected options. 
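The stopping rule just described can be checked numerically for a single question ($N = G = 1$), where selecting a set with total belief $s$ and size $\ell$ has expected payment proportional to $s (1-p)^{\ell-1}$ under the mechanism. The sketch below is illustrative only: the function names and the example belief vector are assumptions introduced here, and the brute-force search is merely a sanity check.

```python
from itertools import combinations


def incentivized_count(beliefs, p):
    """Largest z such that p_z / (p_1 + ... + p_z) > p, beliefs sorted decreasingly."""
    ordered = sorted(beliefs, reverse=True)
    m, running = 0, 0.0
    for z, b in enumerate(ordered, start=1):
        running += b
        if running > 0 and b / running > p:
            m = z
    return m


def best_selection_brute_force(beliefs, p):
    """Subset maximizing s * (1 - p)^(size - 1), the expected payment up to scale."""
    B = len(beliefs)
    best, best_value = None, -1.0
    for size in range(1, B + 1):
        for subset in combinations(range(B), size):
            s = sum(beliefs[i] for i in subset)
            value = s * (1.0 - p) ** (size - 1)
            if value > best_value:
                best, best_value = set(subset), value
    return best


beliefs = [0.6, 0.3, 0.08, 0.02]  # hypothetical beliefs, already sorted
print(incentivized_count(beliefs, p=0.1))
print(best_selection_brute_force(beliefs, p=0.1))
```

With these beliefs and $p = 0.1$, the third option contributes only $0.08 / 0.98 < 0.1$ of the running total, so the rule stops after two options, which agrees with the brute-force maximizer.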


Let us now verify that the earlier result of Theorem 4.1 for “coarse beliefs” is indeed a special case of 
Theorem 5.1. To this end, suppose the beliefs of the worker for any particular question are $p_1 \geq \cdots \geq p_k > 
p > p_{k+1} = \cdots = p_B = 0$ for some $k \in [B]$. Then we have 

$$\frac{p_z}{\sum_{i=1}^{z} p_i} = 0 < p \quad \text{for all } z \geq k+1,$$

and 

$$\frac{p_z}{\sum_{i=1}^{z} p_i} \geq \frac{p_z}{1} > p \quad \text{for all } z \leq k.$$

It follows that under the result of Theorem 5.1, a worker with “coarse beliefs” will be incentivized to select 
precisely the support of her beliefs. 


5.2 An axiomatic derivation 

We now present an alternative axiomatic derivation of our mechanism when accommodating workers with 
arbitrary beliefs. The derivation involves a “no-free-lunch axiom” of Shah and Zhou (2015), which when 
adapted to our approval-voting based setting is defined as follows. We say that a worker has ‘attempted’ a 
question if for that question, she doesn’t select all the B options. We say that the answer to a question is 
wrong if the correct option does not lie in the set of selected options. 


Definition 4 (No-free-lunch; adapted from Shah and Zhou (2015)). If the answer to every attempted question 
in the gold standard turns out to be wrong, then the worker gets a payment of zero, namely, 

$$f(x_1, \ldots, x_G) = 0 \quad \forall \, (x_1, \ldots, x_G) \in \{-(B-1), \ldots, -1, B\}^G \setminus \{B\}^G.$$
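For small $B$ and $G$, the axiom can be checked by enumerating every evaluation vector it covers. The sketch below is an illustration under assumptions: it takes $\alpha_{\min} = 0$ (so that a zero payment is attainable), and the names `mechanism_1`, `satisfies_no_free_lunch`, and the additive comparison rule are hypothetical and introduced only for this example.

```python
from itertools import product


def mechanism_1(x, p, alpha):
    """Mechanism 1 with alpha_min = 0 and alpha = alpha_max (assumed normalization)."""
    payment = alpha
    for xi in x:
        if xi < 1:
            return 0.0
        payment *= (1.0 - p) ** (xi - 1)
    return payment


def satisfies_no_free_lunch(f, B, G):
    """Check f(x) = 0 on {-(B-1),...,-1,B}^G excluding the all-B vector."""
    allowed = list(range(-(B - 1), 0)) + [B]
    for x in product(allowed, repeat=G):
        if all(v == B for v in x):
            continue  # no question attempted: the axiom does not apply
        if f(x) != 0:
            return False
    return True


# A hypothetical additive rule paying an equal share per positive evaluation.
additive = lambda x: sum(1 for v in x if v > 0) / len(x)

print(satisfies_no_free_lunch(lambda x: mechanism_1(x, p=0.2, alpha=1.0), B=3, G=2))
print(satisfies_no_free_lunch(additive, B=3, G=2))
```

The multiplicative mechanism passes because any wrong attempted answer zeroes the whole product, whereas the additive rule still pays for questions on which the worker selected all $B$ options.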
Figure 2: Illustration of two of the three experiments we conducted on Amazon Mechanical Turk. 
(a) “Select ALL options that could be the animal in this image,” with options Cheetah, Lion, Tiger, Leopard, 
Jaguar, and Puma. (b) “Select ALL options that could be the texture in this image,” with options Sand, Brick, 
Grass, Wood, Cloth, and Gravel. 


The no-free-lunch axiom is quantitatively different from the criterion of frugality proposed in this paper. 
However, both these notions have the same qualitative goal, namely to minimize the expenditure when no 
useful data is obtained, while providing higher payments to workers providing better data. Interestingly, 
as we show below, both these notions lead to the same (unique) mechanism under our setting of approval 
voting. 

Theorem 5.2. Consider no assumptions on the minimum value of the belief, and suppose the workers must 
be incentivized to select options $\{1, \ldots, m\}$ where $m = \max\{ z \in [B] : p_z / \sum_{i=1}^{z} p_i > p \}$. Then Mechanism 1 is 
the one and only mechanism that is incentive compatible and satisfies no-free-lunch. 

6 Preliminary experiments 

This section presents results from an evaluation of our proposed mechanism, Mechanism 1, on the popular 
Amazon Mechanical Turk (mturk.com) commercial crowdsourcing platform. The goal of this preliminary 
experimental exercise is to perform a basic check on whether our mechanism has the potential to work in 
practice. Specifically, our goal is to evaluate the primary hypotheses underlying the theory: (i) whether 
workers are able to make judicious use of the approval voting setup, (ii) whether the existence of the 
mechanism makes any difference, and (iii) whether there is any opposition from the workers to the interface or the 
mechanism for any reason. 

It is important to keep in mind that conclusive experiments for mechanism design are in general quite 
expensive with respect to time (workers may need months to understand a new mechanism) and budget. 
They are unlike typical machine-learning experiments that require only existing benchmark datasets. 
Moreover, the wordings or the interface may exert a significant influence on the workers’ behavior. Like most 
mechanism design papers, we position our work primarily as a theoretical study. We expect that more 
detailed experiments will follow the publication of our work; indeed, it is best if experiments on such incentive 
schemes are conducted by multiple groups. 

6.1 Methods 

We conducted three separate sets of experiments, with over 200 workers in each experiment: 

• Identifying languages from displayed text (Figure 1) 

• Identifying animals in displayed images (Figure 2(a)) 

• Identifying textures in displayed images (Figure 2(b)). 
Figure 3: Raw data from the three experiments conducted on Amazon Mechanical Turk. (a) Fraction of 
responses that evaluate to different values. The magnitude of the evaluation represents the number of options 
selected and its sign denotes whether the correct option was selected (positive) or not (negative). (b) Fraction 
wrong among attempted questions. (c) Fraction wrong when only one option was selected. (d) Average bonus 
per worker (cents). 


In each experiment, every worker was assigned one of four mechanisms uniformly at random. The variable
component of each mechanism was executed as a "bonus payment" based on the evaluation of the worker's
performance on the gold standard questions, on top of a guaranteed payment of 10 cents (this was $\alpha_{\min}$).
The four mechanisms tested were:

• Single-selection interface with additive payments: The worker must select a single option for every
question. The bonus starts at zero and is increased additively by a fixed amount for every correct
answer.

• Skip-based single-selection interface with multiplicative payments (Shah and Zhou, 2015): For every
question, the worker can either select one option or skip the question. The bonus starts at a certain
positive value, is reduced by a certain fraction for each skipped question, and becomes zero in case of
an incorrect answer.

• Approval-voting interface with a fixed payment: The bonus is fixed.

• Approval-voting interface with Mechanism 1.
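The two single-selection baselines in the list above can be sketched in a few lines. This is only an illustration: the function names and the parameters `per_correct`, `start` and `skip_fraction` are hypothetical, since the amounts actually used in the experiments are not stated here.

```python
def additive_bonus(evals, per_correct):
    """Additive baseline: evals[i] is +1 for a correct answer, -1 for a wrong one."""
    return per_correct * sum(1 for x in evals if x > 0)


def skip_multiplicative_bonus(evals, start, skip_fraction):
    """Skip-based baseline: evals[i] is +1 (correct), -1 (wrong) or 0 (skipped).

    The bonus starts at `start`, shrinks by `skip_fraction` for each skipped
    question, and drops to zero if any answer is wrong.
    """
    if any(x < 0 for x in evals):
        return 0.0
    skipped = sum(1 for x in evals if x == 0)
    return start * (1 - skip_fraction) ** skipped
```

Under the multiplicative rule a single wrong answer wipes out the bonus, which is what makes skipping attractive to a worker who is unsure.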

Given the caveats associated with experiments on mechanism design mentioned earlier, we provided detailed
instructions about the task and the mechanism to each worker, and also made them work through multiple
examples. All data related to the experiments, including the interfaces used, specifics about the
payment mechanisms, and the responses of the workers, are available on the website of the first author.


6.2 Results 

Experiment    T^2     F       p
Languages     15.7    7.8     0.0004
Textures      21.3    10.7    0.000025
Animals       10.2    5.1     0.0062

Table 1: Hotelling's T-squared test comparing the data from the fixed payment mechanism and the data from
Mechanism 1 proposed in this paper.

Let us first examine the raw data. Figure 3 presents combined results from the three experiments. Figure 3(a)
shows the breakdown of the evaluations of all the collected responses. The magnitude of the evaluation represents
the number of options selected and its sign denotes whether the correct option was selected (positive)
or not (negative). Figure 3(b) depicts the fraction of responses to attempted questions that turned out to
be wrong. Figure 3(c) depicts the fraction of responses that were wrong when only one option was selected.
Figure 3(d) depicts the average bonus per worker. Using this data, let us now investigate the three
questions posed at the beginning of the experiments:

(i) Are the workers making judicious use of the approval voting setup? One can observe from Figure 3(a)
that more than 40% of the responses comprised a selection of two or three options, suggesting that the workers
did understand the concept of approval voting.

(ii) Does the presence of a mechanism make a difference? We compared the data from the approval
voting setup under the fixed payment mechanism with the data from the approval voting setup under Mechanism 1.
In particular, we applied Hotelling's T-squared test, where we treated the response by any worker to any
question as a two-dimensional data point, with the number of options selected and the correctness of the answer
as the two dimensions. The results of this test are listed in Table 1. We could reject the null hypothesis (of
the two sets of data being drawn from distributions with identical means) with p < 0.01 for each of the three
experiments.
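For readers who wish to run this kind of test on their own data, the following is a minimal sketch of the standard two-sample Hotelling T-squared test for two-dimensional points. It is not the authors' analysis script; the function name is hypothetical, a pooled covariance is assumed, and the p-value uses the closed-form tail of the F distribution that holds when the first degree of freedom is 2.

```python
def hotelling_t2_2d(sample_a, sample_b):
    """Two-sample Hotelling T^2 test for 2-D points; returns (T2, F, p)."""
    n1, n2 = len(sample_a), len(sample_b)

    def mean(pts):
        return [sum(p[k] for p in pts) / len(pts) for k in range(2)]

    def scatter(pts, mu):
        # Sums of squares and cross-products of the centred points.
        sxx = sum((p[0] - mu[0]) ** 2 for p in pts)
        sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in pts)
        syy = sum((p[1] - mu[1]) ** 2 for p in pts)
        return sxx, sxy, syy

    mu_a, mu_b = mean(sample_a), mean(sample_b)
    dof = n1 + n2 - 2
    sxx, sxy, syy = [
        (u + v) / dof
        for u, v in zip(scatter(sample_a, mu_a), scatter(sample_b, mu_b))
    ]  # pooled covariance entries
    det = sxx * syy - sxy ** 2
    d0, d1 = mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]
    # Quadratic form d^T S^{-1} d using the closed-form 2x2 inverse.
    quad = (syy * d0 * d0 - 2 * sxy * d0 * d1 + sxx * d1 * d1) / det
    t2 = (n1 * n2) / (n1 + n2) * quad
    m = n1 + n2 - 3  # second F degree of freedom: n1 + n2 - p - 1 with p = 2
    f_stat = t2 * m / (2 * dof)
    p_value = (1 + 2 * f_stat / m) ** (-m / 2)  # exact tail of F(2, m)
    return t2, f_stat, p_value
```

Each data point would be a pair (number of options selected, correctness as 0 or 1). For large samples F is close to T^2/2, which matches the relationship between the T^2 and F columns of Table 1.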

(iii) Is there any opposition from the workers to the interface or the mechanism for any reason? We also
elicited feedback about the task from every worker, informing them that the feedback would not affect their
payment. We received mostly neutral feedback, some positive feedback, and no negative feedback about
either the approval voting interface or our mechanism.

All in all, these preliminary experiments indicate that our mechanism is practical and can potentially 
be useful for many applications in machine learning, paying higher amounts to good workers and lower 
amounts to freeloaders or spammers. 

A concluding remark. A standard means of denoising data from crowdsourcing is to ask every question
to multiple workers, and employ a statistical aggregation algorithm to aggregate the data so obtained. In the
future, we wish to evaluate the performance of our proposed interface and mechanism on such aggregated
data. To this end, our goal is to design algorithms for statistical aggregation of
data collected through the interface and mechanism proposed in this paper.


7 General utility functions 

In this section, we consider a setting where the worker, instead of maximizing her expected payment, aims 
to maximize the expected value of some utility function of her payment. 

Consider any function $U : \mathbb{R} \to \mathbb{R}$. Suppose that instead of aiming to maximize the expected payment,
the worker has some utility U for any payment made to her, and that she aims to maximize the expected
utility. In other words, for any payment $f$ made to the worker (based on the evaluation of her answers to the
gold standard questions), her utility for this payment is $U(f)$. The worker aims to maximize the expected
value of $U(f)$.

We will require the function U to be strictly increasing and invertible. The results presented so far in the 
paper implicitly assumed that the utility is simply the identity function, namely U(x) = x. The function U 
is assumed to be public knowledge. 

Given the evaluations $x_1, \ldots, x_G$ of the worker's responses to the G gold standard questions, consider
the following payment mechanism:

$$f(x_1, \ldots, x_G) = U^{-1}\Big( \big(U(\alpha_{\max}) - U(\alpha_{\min})\big) (1 - \rho)^{\sum_{i=1}^{G} (x_i - 1)} \, \mathbf{1}\{\min_{i \in [G]} x_i > 0\} + U(\alpha_{\min}) \Big). \qquad (5)$$

It is easy to see that the properties of Mechanism 1 carry over to this mechanism in the case of a general
utility function U. This feature is formalized in the following proposition.

Proposition 7.1. For a worker who aims to maximize function U of the payment, the mechanism in Equation (5)
is incentive compatible, frugal, and is the one and only incentive compatible mechanism to satisfy
the no-free-lunch axiom.
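The construction in this section amounts to applying the identity-utility mechanism on the utility scale and mapping the result back through $U^{-1}$. The sketch below illustrates this under the assumption that Mechanism 1 pays $(\alpha_{\max} - \alpha_{\min})(1-\rho)^{\sum_i (x_i - 1)}$ above $\alpha_{\min}$ when every gold standard answer contains the correct option, and $\alpha_{\min}$ otherwise, which is the form used in the appendix proofs. The function names are hypothetical.

```python
import math


def mechanism1_payment(evals, rho, a_min, a_max):
    """Assumed form of Mechanism 1 (identity utility).

    evals[i] > 0 means the correct option was among the evals[i] selected ones;
    evals[i] < 0 means it was not.
    """
    if min(evals) <= 0:
        return a_min
    return (a_max - a_min) * (1 - rho) ** sum(x - 1 for x in evals) + a_min


def utility_adjusted_payment(evals, rho, a_min, a_max, U, U_inv):
    """Run the mechanism on the utility scale, then map back with U^{-1}."""
    return U_inv(mechanism1_payment(evals, rho, U(a_min), U(a_max)))
```

With `U` set to the identity the two functions coincide. With a concave utility such as `math.sqrt`, the payment for partially confident answers is lower than under the identity utility, while the two extreme payments are unchanged.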


8 An alternative problem statement 

In earlier sections, we made the coarse belief assumption of the existence of some p G (0, 5 ) such that the 
belief of a worker for any option is assumed to either equal 0 or more than p. We then designed a mechanism 
to elicit the support of the worker's belief under this assumption. A natural question that arises is whether, instead
of making a coarse belief assumption, we can fix a parameter, say, $\sigma \in (0, 1)$, and incentivize the worker to
select all options for which her belief is strictly greater than $\sigma$. Although not the primary focus of this paper, we
devote the present section to investigating this complementary setting out of intellectual curiosity as well as
practical relevance. As we show below, the answer to this question is both yes and no.


8.1 Problem setting 


For a given value of $\sigma \in (0,1)$, we will call a mechanism incentive compatible if the expected payment of
any worker is strictly maximized when the worker selects all options for which her belief is strictly greater
than $\sigma$.

We retain most of the earlier notation, with a few exceptions as follows. We continue to let $f$ denote
the payment function, $f : \{-(B-1), \ldots, B\}^G \to [\alpha_{\min}, \alpha_{\max}]$. Observe that unlike the setting considered
earlier, here we have included 0 in the domain of the payment function. This is because under
the present setting, when $\sigma \ge \frac{1}{B}$, there is a possibility that the worker has a belief no more than $\sigma$ for each
option, for instance, if the worker is totally clueless.

Let us define two integers $s_{\min}$ and $s_{\max}$ as $s_{\min} =$ and $s_{\max} =$

Observe that if $\sigma < \frac{1}{B}$ then it is meaningless to let the worker select zero options since the belief for
at least one option must be $\frac{1}{B}$ or higher. Also observe that for any value of $\sigma \in (0,1)$, it is meaningless
to allow the worker to select  or more options, since it is mathematically impossible for that many
options to have beliefs more than $\sigma$. As a result, we will require the worker to select at least $s_{\min}$ and at
most $s_{\max}$ options for any question. The goal remains to design the payment function $f(x_1, \ldots, x_G)$ when
$|x_i| \in \{s_{\min}, \ldots, s_{\max}\}$ for every $i \in [G]$. If the worker's responses do not satisfy this condition, then we
assume the convention of setting the payment to a small enough value (say, $\alpha_{\min}$ or some further penalty).

We do not assume the restriction of coarseness of the beliefs. We stick to the identity utility function,
while noting that extension to other utility functions is straightforward following Section 7.

Finally, we note some special cases which we exclude from the subsequent analysis. The case of $\sigma = 0$
degenerates to the impossibility result of Theorem 3.1 proved earlier. The cases of $B = 2$ or $\sigma \ge \frac{1}{2}$
degenerate to the "skip-based" single-selection setting studied in Shah and Zhou (2015). Hence we focus
on the case of $B \ge 3$ and $\sigma \in (0, \frac{1}{2})$ in the rest of this section.


Footnote: The function $\mathbf{1} : \{\text{True}, \text{False}\} \to \{0, 1\}$ is the indicator function, with $\mathbf{1}\{x\} = 1$ if $x$ is true, and 0 otherwise.

8.2 Mechanism


Mechanism 2 Incentive mechanism for the alternative problem formulation 

• Input: Evaluations of the worker's answers to the G gold standard questions $(x_1, \ldots, x_G)$

• Output: Define a function $g : \mathbb{R} \to \mathbb{R}$ as

$$g(y) = (B - |y|)\,\sigma + \mathbf{1}\{y \ge 1\}.$$

The worker's payment is

$$f(x_1, \ldots, x_G) = a + b \sum_{i=1}^{G} g(x_i),$$

where a = amin and b = ■ 


Given the conventions described in the previous subsection for the payment function, it remains to
construct the payment function under "normal" conditions, that is, when $s_{\min} \le |x_i| \le s_{\max}$ for every
$i \in [G]$. Mechanism 2 presents our proposed mechanism for this setting.

Theorem 8.1. Consider any $\sigma \in (0, \frac{1}{2})$, $N \ge G \ge 1$ and $B \ge 3$. Consider the goal of designing a
mechanism such that for each question, the worker is incentivized to select every option for which her belief
is more than $\sigma$. Assume that no belief equals exactly $\sigma$. Then Mechanism 2 is incentive compatible.

The function $g$, in words, penalizes the selection of an incorrect option by $\sigma$ and rewards the selection of
the correct option by 1. Under beliefs $\{p_1, \ldots, p_B\}$ for a gold standard question, when the worker answers
as per our requirements, the expected value of $g$ equals $\sum_{j=1}^{B} \max\{p_j, \sigma\}$.
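The claim about the expected value of $g$ can be checked numerically for a single question. The sketch below is an illustration only: the function names are hypothetical, `threshold` stands for the belief threshold of this section, and the search runs over all subsets of options rather than only those of permitted size.

```python
from itertools import combinations


def g(y, B, threshold):
    """Per-question score: |y| options selected, y >= 1 iff the correct one is among them."""
    return (B - abs(y)) * threshold + (1.0 if y >= 1 else 0.0)


def expected_g(selected, beliefs, threshold):
    """Expected score of selecting the option indices in `selected` under `beliefs`."""
    B, k = len(beliefs), len(selected)
    p_hit = sum(beliefs[j] for j in selected)  # probability the correct option is selected
    return p_hit * g(k, B, threshold) + (1 - p_hit) * g(-k, B, threshold)


def best_selection(beliefs, threshold):
    """Brute-force search for the subset of options with the highest expected score."""
    B = len(beliefs)
    subsets = (set(c) for k in range(B + 1) for c in combinations(range(B), k))
    return max(subsets, key=lambda s: expected_g(s, beliefs, threshold))
```

For beliefs (0.5, 0.3, 0.15, 0.05) and a threshold of 0.2, the search returns the first two options, which are exactly those with belief above the threshold. Each option contributes its belief if selected and the threshold if not, so the optimum picks whichever is larger for each option.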

The setting also permits a "multiplicative" mechanism, consistent with the earlier results in this paper.

Corollary 8.2. Under the assumption that no belief equals exactly $\sigma$, the mechanism

$$f(x_1, \ldots, x_G) = a + b \prod_{i=1}^{G} \big(g(x_i) - c\big),$$

for some constants $a$, $b > 0$ and $c < g(-s_{\max})$, is also incentive compatible.


8.3 Uniqueness and an impossibility result 


In this section, we show that the core structure of Mechanism 2, namely the function $g$, is essential for any
mechanism. We also show that the (mild) assumption of no belief equalling exactly $\sigma$ is unavoidable.

Theorem 8.3. Consider any $\sigma \in (0, \frac{1}{2})$ and any $B \ge 3$. Consider the goal of designing a mechanism such
that for each question, the worker is incentivized to select every option for which her belief is more than $\sigma$.
Then:

(A) Under the assumption that no belief equals exactly $\sigma$, when $G = 1$, the function $g$ is the one and only
incentive-compatible mechanism up to a constant shift and positive scaling.

(B) For any $N \ge G \ge 1$, no mechanism is incentive compatible in the absence of this assumption.


While we do not have a complete answer as to what the "best" or "unique" mechanism is for general
values of N and G, going by results proved earlier in the paper, we conjecture that the multiplicative
version of the mechanism (Corollary 8.2) may possess attractive properties. Further exploration of this
setting is beyond the scope of this paper.

9 Discussion and open problems


Our goal is to deliver high quality labels for machine learning applications, at low costs, by means of 
incentive mechanisms or aggregation algorithms or both. In this paper, we pursue the former approach. 
We take an approval-voting based means of gathering labeled data from crowdsourcing. We design an 
incentive mechanism via a principled theoretical approach, and prove appealing properties of optimality 
and uniqueness of our proposed mechanism. Preliminary experiments conducted on Amazon Mechanical 
Turk corroborate the usefulness of this mechanism for practical scenarios. Our mechanism may also draw 
more experts to the crowdsourcing platform since their compensation will be significantly higher than that 
of mediocre workers, unlike most compensation mechanisms in current use. 

We conclude with a discussion on closely related topics that merit investigation in the future. 

Aggregation of labels. For the traditional single-selection setting, there is a long, existing line of work
on statistical methods to aggregate redundant noisy data from multiple workers (Dawid and Skene, 1979;
Whitehill et al., 2009; Raykar et al., 2010; Karger et al., 2011; Liu et al., 2012; Zhou et al., 2012). An open
problem is the design of aggregation algorithms for approval-voting-based data: algorithms that can exploit
the specific structure of the responses that arise as a result of the approval voting interface and the proposed
mechanism. There is indeed work on aggregation algorithms (Massó and Vorsatz, 2008; Caragiannis et al.,
2010; Brams and Kilgour, 2014; Procaccia and Shah, 2015) and probabilistic models (Marley, 1993; Falmagne
and Regenwetter, 1996; Doignon et al., 2004; Regenwetter and Tsetlin, 2004) for approval voting
in the context of social choice theory; their objective, however, is primarily fairness and strategyproofness
of the voting procedure, as opposed to our goal of denoising data obtained from multiple heterogeneous
workers as required for labeling tasks in crowdsourcing.

Choosing the right interface. There are tradeoffs between various interfaces for crowdsourcing. For 
instance, the approval voting interface elicits the support of the belief whereas the single selection interface 
elicits the mode. Choosing among these two interfaces would depend on the application under consideration, 
and moreover, one may adaptively switch between the two depending on the data obtained. A natural 
question that one may further ask is, why not elicit the entire belief distribution itself? While the entire
belief distribution seems to supersede the support and the mode, stating the distribution will also require
much more time and effort from the workers, and will often also suffer from higher noise. These tradeoffs
must be taken into account when choosing the interface for the application at hand. 

The coarse beliefs parameter. One may wish to evaluate the value of $\rho$ by explicitly asking workers on
the crowdsourcing platform for this value. However, it is noted in the literature (e.g., see Shah et al. (2015)
for experiments on Amazon Mechanical Turk) that the cardinal representations that humans provide are
not always consistent with their respective mental beliefs, and are far noisier. This phenomenon suggests
the need to develop alternative methods of evaluating this parameter. Indeed, measurement is
considered one of the most difficult parts of behavioral research.

We look forward to future work exploring these topics in depth. 
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APPENDIX 
A Proofs 

In this section, we present proofs of the various theoretical results presented in the paper. 

A.1 Proof of Theorem 3.1: Impossibility

We assume that there indeed exists some incentive-compatible payment function $f$, and derive a contradiction.

Let us first consider the special case of $N = G = 1$ and $B = 2$. Since $N = G = 1$, there is only one
question. Let $p_1 \ge 0.5$ be the probability, according to the belief of the worker, that option 1 is correct; the
worker then believes that option 2 is correct with probability $(1 - p_1)$.

When $p_1 = 1$, we need the worker to select option 1 alone. Thus we need

$$f(1) > f(2).$$

When $p_1 \in (0.5, 1)$, we require the worker to select options 1 and 2, as opposed to selecting option 1
alone. For this we need

$$p_1 f(1) + (1 - p_1) f(-1) < f(2).$$

It follows that we need

$$(1 - p_1)\big(f(1) - f(-1)\big) > f(1) - f(2). \qquad (6)$$

However, since $f(1) - f(2) > 0$, the inequality (6) is satisfied only when $f(1) > f(-1)$ and
$(1 - p_1) > \frac{f(1) - f(2)}{f(1) - f(-1)}$. Thus for any given payment function $f$, a worker with belief
$(1 - p_1) \in \big(0, \frac{f(1) - f(2)}{f(1) - f(-1)}\big]$ will not be incentivized to select
the support of her belief. This yields a contradiction.

We now move on to the general case of $N \ge G \ge 1$ and $B \ge 2$. Consider a worker who is clueless
about questions 2 through N (i.e., her belief is uniform across all options for these questions). Suppose this
worker selects all B options for these questions as desired. For the first question, suppose that the worker
is sure that options $3, \ldots, B$ are incorrect. We are now left with the first question and the first two options
for this question. Letting $X$ denote a random variable representing the evaluation of the worker's response
to the first question, the expected payment then is

$$\frac{G}{N}\, \mathbb{E}\big[f(X, B, \ldots, B)\big] + \Big(1 - \frac{G}{N}\Big) f(B, \ldots, B).$$

The expectation in the first term is taken with respect to the randomness in $X$. Defining

$$\tilde{f}(X) := \frac{G}{N} f(X, B, \ldots, B) + \Big(1 - \frac{G}{N}\Big) f(B, \ldots, B),$$

and applying the same arguments to $\tilde{f}$ as those for $f$ for the case of $N = G = 1$, $B = 2$ above gives the
desired contradiction. This completes the proof of impossibility.
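The single-question argument can be made concrete with a small numerical sketch. Given any three payment values f(1), f(2), f(-1), the hypothetical helper below returns a belief $q = 1 - p_1$ for the second option at which selecting option 1 alone pays at least as much as selecting both, so that the worker is not incentivized to report the support of her belief. The function names are illustrative only.

```python
def violating_belief(f1, f2, f_neg1):
    """Return q = 1 - p1 in (0, 0.5) at which inequality (6) fails.

    Returns None when f(1) <= f(2), in which case a worker who is certain
    of option 1 already has no strict incentive to select it alone.
    """
    if f1 <= f2:
        return None
    if f1 <= f_neg1:
        return 0.25  # inequality (6) fails for every q in this case
    threshold = (f1 - f2) / (f1 - f_neg1)
    return min(threshold, 0.5) / 2  # strictly below the threshold


def prefers_single(q, f1, f2, f_neg1):
    """True if selecting option 1 alone pays at least as much as selecting both."""
    return (1 - q) * f1 + q * f_neg1 >= f2
```

For example, with f(1) = 10, f(2) = 8 and f(-1) = 0 the threshold is 0.2, and a worker with q = 0.1 expects 9 from selecting option 1 alone against 8 from selecting both.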


A.2 Proof of Lemma 4.3: The workhorse lemma

Consider some po £ {Pt ^)- Consider a worker such that for every question i G X, her belief is po for the 
first option and for each of the last {yi — 1) options. For every question i her belief is uniformly 

distributed among the first yi options. Now, if the worker selects precisely the support of her beliefs for 
every question then her expected payment $i is 

$1 = YI ( 7 ) 

'ci (il,...,iG)C[Af] 


We will compare the aforementioned action to another action, where for each question i £ Z, the worker 
selects only the last {yi — 1) options but not the first option; for each question i ^Z, the worker selects the 
support of her belief. Under this action, the expected payment $2 is 


$2 


7^ E E no- 

(JIv-Jg) (eiv.CG) 
c[Af] e{-i,i}G 




i}| 


Po 


1}I 


/(uyL. 




( 8 ) 


In the expression Q, the outer summation represents the expectation over the random choice of the G gold 
standard questions among the N questions. The inner summation represents the expectation with respect to 
the correctness or incorrectness of the answers to the G gold standard questions: for any question f, = 1 
captures the event where the question in the gold standard is answered correctly and e* = — 1 represents 
the event of this question being answered incorrectly. The term l{{ji \ e* = —1} U X} ensures that only 
the questions in X can be wrong, since it is only these questions for which the worker has selected a subset 
of her belief’s support. 

Since f{x) > 0 for all x, we can lower bound $2 as 

*2>X .fell/fa',,..., 4). (9) 
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An incentive compatible mechanism must incentivize the worker to perform the first action (over the sec¬ 
ond), i.e, must have $i > $2. Thus from Q and Q, we get 

7W { 1 - (10) 

(c) (ii,...,jG)c[Af] (g) {h,...jG)Q[N] 

Note that (10) must hold for all $\rho_0 > \rho$. The left hand side of (10) does not involve $\rho_0$ whereas the right
hand side is continuous in $\rho_0$. It follows that

'g) (ii,...,jG)C[iV] (g) 0i,...,jG)C[Af] 

This proves the first part of the lemma. 

We now move on to the second part of the lemma, concerning equality in ( [TT] ). Suppose /(eiy'^, ■ • •, ^Gl/j^) 
is strictly positive for any (ji,..., jg) ^ [N], {(ei,..., ec) G {-1,| e* = 1 whenever jj ^ I}. 
Then (11) will necessarily be a strict inequality. The claimed necessary condition for equality is thus established.


A.3 Proof of Theorem [4^ Frugality 

Without loss of generality, assume that $\alpha_{\min} = 0$ since in our setting, the property of incentive compatibility
is invariant to any constant shift and positive scaling of the payment. We adopt the succinct notation

$$\alpha := \alpha_{\max} - \alpha_{\min}.$$

Consider any incentive compatible mechanism $f$ such that $f(1, \ldots, 1) = \alpha$ and $f(B, \ldots, B) = (1 - \rho)^{G(B-1)} \alpha$.
We will show that this payment mechanism must be identical to Mechanism 1.

We consider the set of evaluations $x$ whose elements are non-increasing, i.e., $x_1 \ge x_2 \ge \cdots \ge x_G$. The
proof for any other ordering follows in an identical manner.

First consider any $x$ such that $x_G > 0$.

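• Let $\gamma(x)$ denote the number of distinct entries in $x$:

$$\gamma(x) := 1 + \sum_{i=1}^{G-1} \mathbf{1}\{x_i \ne x_{i+1}\}$$

• Let $\sigma(x)$ denote the size of the last jump in $x$:

$$\sigma(x) := x_j - x_{j+1}, \quad \text{where } j \text{ is the largest index } i \in [G-1] \text{ with } x_i \ne x_{i+1}$$

• Let $\beta(x)$ denote the numeric value of $x$ in a $B$-ary number system:

$$\beta(x) := \sum_{i=1}^{G} (x_i - 1) B^{G-i}$$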
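The three quantities that drive the nested induction are easy to compute directly. The sketch below is illustrative; the function names are hypothetical, and returning 0 for the last jump of a constant vector is an assumed convention (the proof treats that situation as the case of a zero jump).

```python
def gamma(x):
    """Number of distinct entries of a non-increasing vector x."""
    return 1 + sum(1 for a, b in zip(x, x[1:]) if a != b)


def sigma(x):
    """Size of the last jump in x; 0 if all entries are equal (assumed convention)."""
    jumps = [a - b for a, b in zip(x, x[1:]) if a != b]
    return jumps[-1] if jumps else 0


def beta(x, B):
    """Value of x read as a base-B number whose digits are x_i - 1."""
    value = 0
    for xi in x:
        value = value * B + (xi - 1)
    return value
```

Calling these on x = (5, 5, 4, 1, 1) with B = 5 gives 3, 3 and 3075 respectively.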

For example, if $B = 5$, $G = 5$ and $x = (5, 5, 4, 1, 1)$ then $\gamma(x) = |\{5, 4, 1\}| = 3$, $\sigma(x) = 4 - 1 = 3$ (where
$j = 3$), and $\beta(x) = 4 \cdot 5^4 + 4 \cdot 5^3 + 3 \cdot 5^2 + 0 \cdot 5^1 + 0 \cdot 5^0 = 3075$. The proof involves three nested levels
of induction: on $\gamma$, on $\sigma$ and then on $\beta$.

We first induct on $\gamma$. The base case is the set $\{x \mid \gamma(x) = 1\}$, i.e., the set of vectors which have the same
value for all components. Consider any $x_0 \in [B - 1]$. Applying Lemma 4.3 with $y = (x_0 + 1, \ldots, x_0 + 1)$
and $y' = (x_0, \ldots, x_0)$ gives

$$f(x_0 + 1, \ldots, x_0 + 1) \ge (1 - \rho)^{G} f(x_0, \ldots, x_0).$$

Since this inequality is true for every $x_0 \in [B - 1]$, we have

$$f(B, \ldots, B) \ge (1 - \rho)^{G(B - x_0)} f(x_0, \ldots, x_0) \ge (1 - \rho)^{G(B-1)} f(1, \ldots, 1).$$

Setting $f(B, \ldots, B) = (1 - \rho)^{G(B-1)} \alpha$ and $f(1, \ldots, 1) = \alpha$ proves the base case.

Now suppose our hypothesis is true for all $\{x \mid \gamma(x) \le \gamma_0 - 1\}$ for some $\gamma_0 \in \{2, \ldots, B\}$. We will
now prove that the hypothesis is also true for all $\{x \mid \gamma(x) \le \gamma_0\}$. Towards this goal, we will now induct on
$\sigma$. The set of all $\{x \mid \gamma(x) = \gamma_0 - 1\}$ can be treated as a base case for our induction, with this base case
corresponding to $\sigma = 0$. Due to the induction hypothesis on $\gamma$, the base case of $\sigma = 0$ is already proven.

Now suppose that the hypothesis is true for all $\{x \mid \gamma(x) = \gamma_0, \sigma(x) \le \sigma_0 - 1\}$ for some $\sigma_0 \in [B - 1]$.
We will prove that the hypothesis remains true for all $\{x \mid \gamma(x) = \gamma_0, \sigma(x) = \sigma_0\}$. To this end, we will
induct on $\beta$.

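Recall that we have restricted our attention to those $x$ which have their elements in descending order.
Observe that the element with the minimum value of $\beta$ in the set $\{x \mid \gamma(x) = \gamma_0, \sigma(x) = \sigma_0\}$ is
$(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 1, \ldots, 1)$. We will prove the hypothesis for this element as the base case for our induction
on $\beta$. Applying Lemma 4.3 with $y = (\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0 + 1, 1, \ldots, 1)$ and
$y' = (\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0, 1, \ldots, 1)$ gives the inequality

$$\begin{aligned}
& c_1 f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0 + 1, 1, \ldots, 1) + c_1' f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, 1, 1, \ldots, 1) \\
& \qquad + \sum_{s \subset \{\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2\}} \big( c_s f(s, 1, 1, \ldots, 1) + c_s' f(s, \sigma_0 + 1, 1, \ldots, 1) \big) \\
\ge\; & c_1 (1 - \rho) f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0, 1, \ldots, 1) + c_1' f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, 1, 1, \ldots, 1) \\
& \qquad + \sum_{s \subset \{\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2\}} \big( c_s f(s, 1, \ldots, 1) + c_s' (1 - \rho) f(s, \sigma_0, 1, \ldots, 1) \big), \qquad (12)
\end{aligned}$$

for some positive constants $c_1, c_1', c_s, c_s'$ (which represent the probabilities of the respective set of G questions
being chosen as the G gold standard questions). Now, for any $s \subset \{\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2\}$, observe
that $\gamma(s, \sigma_0 + 1, 1, \ldots, 1) \le \gamma_0 - 1$ and $\gamma(s, \sigma_0, 1, \ldots, 1) \le \gamma_0 - 1$. Thus from our induction hypothesis,
we have

$$f(s, \sigma_0 + 1, 1, \ldots, 1) = (1 - \rho) f(s, \sigma_0, 1, \ldots, 1). \qquad (13)$$

Also, $\gamma(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0, 1, \ldots, 1) = \gamma_0$ and $\sigma(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0, 1, \ldots, 1) = \sigma_0 - 1$.
Consequently from our induction hypothesis, we have

$$f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0, 1, \ldots, 1) = (1 - \rho)^{(\gamma_0 + \sigma_0 - 2) + \cdots + (\sigma_0 + 1) + (\sigma_0 - 1)} \alpha. \qquad (14)$$

Substituting (13) and (14) in (12) and canceling out common terms gives

$$f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0 + 1, 1, \ldots, 1) \ge (1 - \rho)^{(\gamma_0 + \sigma_0 - 2) + \cdots + (\sigma_0 + 1) + \sigma_0} \alpha.$$

We will now derive a matching upper bound on $f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0 + 1, 1, \ldots, 1)$. Applying
Lemma 4.3 with $y = (\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 2, \ldots, 2)$ and $y' = (\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 1, \ldots, 1)$ gives

$$\begin{aligned}
& c_1 f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 2, \ldots, 2) + \sum_{s \subset \{\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1\}} c_s f(s, 2, \ldots, 2) \\
\ge\; & c_1 (1 - \rho)^{G - \gamma_0 + 1} f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 1, \ldots, 1) + \sum_{s \subset \{\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1\}} c_s (1 - \rho)^{G - |s|} f(s, 1, \ldots, 1), \qquad (15)
\end{aligned}$$

for some positive constants $c_1, c_s$. Now, for any $s \subset \{\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1\}$, observe that $\gamma(s, 2, \ldots, 2) \le \gamma_0 - 1$
and $\gamma(s, 1, \ldots, 1) \le \gamma_0 - 1$. Thus from our induction hypothesis, we have

$$f(s, 2, \ldots, 2) = (1 - \rho)^{G - |s|} f(s, 1, \ldots, 1). \qquad (16)$$

Also, $\gamma(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 2, \ldots, 2) \le \gamma_0$ and $\sigma(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 2, \ldots, 2) = \sigma_0 - 1$.
Consequently from our induction hypothesis,

$$f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 1, 2, \ldots, 2) = (1 - \rho)^{(\gamma_0 + \sigma_0 - 2) + \cdots + \sigma_0 + G - \gamma_0 + 1} \alpha. \qquad (17)$$

Substituting these values in (15) and canceling out common terms gives

$$f(\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0 + 1, 1, \ldots, 1) \le (1 - \rho)^{(\gamma_0 + \sigma_0 - 2) + \cdots + \sigma_0} \alpha.$$

We have thus proved that the hypothesis is true for $x = (\gamma_0 + \sigma_0 - 1, \ldots, \sigma_0 + 2, \sigma_0 + 1, 1, \ldots, 1)$, the base
case for our induction on $\beta$.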

Now consider some x* such that 'y{x*) = 70, t(x*) = tq and /3{x*) = /3o, for some /3q. Let us denote 
the components of x* as, x* = (x^,..., x"^, To + x*q, ..., tq + x*q, Xq, ..., Xq) with > X 2 > • • • > 

mi 


> To + Xq for some m > 0, mi > 1, m + mi < G. Suppose the hypothesis is true for all {x| 7 (x) = 


70 , t(x) = To, /3(x) < /3o—1}. Applying Lemma 


4.3 


with y = (x^, . . . , x;;^, TO + Xq, . . . , TO + Xq, Xq, . . . , Xq) 


and y' = (x^,..., xj;^. To + Xq - 1 ,..., To + Xq - 1 , x 


mi 


G’ • • • ’ 


Xq) gives the inequality 


mi 


Cl/(Xi, . . . , x;:;,. To + Xq, . . . , To + Xq, Xq, . . . , Xq) 

'-V-' 

mi 

+ Y Csf{s,X*Q,. . . ,Xq) 

s^{xl,...,xJ^,ao+XQ,...,ao+XQ} 

' -V-" 

mi 

> Cl(l - p)"^^f{xl,.. . ,X;;„To + Xq - 1, . . . ,To + Xq - 1,Xq, . . . ,Xq) 

'-V-' 

mi 

sO{xl,...,xJ^,ao+x’^-l,...,a-o+x’l^-l} 

'• -' 


for some positive constants ci, c^. Observe that 


7(a;l;, 


, To + Xq - 1, . . . , To + Xq - 1, x; 


G> - 




70-1 

70 


if To = 1 
otherwise. 


(18) 


mi 
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and the induction hypothesis is satisfied in the first case. In the second case, 


a{x\,... + Xq - I,... ,ao + Xq - ... ,x*q) = (Tq - 1, 

" -V-' 

mi 

and hence the induction hypothesis is satisfied in the second case as well. Thus 

f{x\,... ,xl,,ao + Xq - I,... ,ao + Xq - 1,Xg, ... ,x*g) 

'-V-' 

mi 

For any for any s C {x|,..., x*^,aQ + x*q - I,..., gq + x*q - 1}, define mi(s) := X]- l{sj = do + x^ - 

mi 

1}. Observe that if mi(s) > 0 then either 7((s, ..., x*g)) < 70 - 1 or a{{s, ..., x^)) < do - 1; 

if mi(s) = 0 then 7((s, Xq, ..., x^)) < 70 — 1- For any s C {x \,..., xj!^, do + Xg,..., do + x^}, define 

'“V'' 

mi 

■= Yli = <70 + Xq}. Observe that if mi(s) > 0 then either 7((s, Xq, ..., x^)) < 70 — 1 or 

/3((s, Xq, ..., Xg)) < /3o — 1; if i7ii(s) = 0 then 7((s, Xq, ..., Xq)) < 70 — 1- Consequently from our 
induction hypothesis we have 

^ Cs/(s,Xq,...,Xq) 

sC{xl,...,x’:^,ao+x^,...,ao+x’^} 

' -V-' 

mi 

c,(l - p)S» ^ 

sG{xl,...,x’^,ao+XQ — l,...,ao+XQ — l} 

'■ -V-' 

mi 

( 20 ) 


Substituting ( [T^ and (^1 in (181 and canceling out common terms gives 

f{xl, . . . , X);^, do + Xq, . . . , do + Xq, Xq, . . . , Xq) 


mi 


> (1 - p)^^f{xl,... ,x*^,ao + x*G - 1,... ,ao + x*G - l,x*G,... ,x*g) 


mi 


= [1 — ^)E™ i(a<*-l)+"il(o'0+3;G-l)+('S'-mi-m)(a;^-l) 


a. 


We will now employ Lemma 4.3 again to derive a matching lower bound. Setting y = {xl,---,x^, 
do + XQ,...,do + XQ,XQ + 1,...,Xq + 1) and y' = (x);,..., x)),, do + Xq, ..., do + Xq, Xq, ..., Xq) 
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in Lemma [4^ yields the inequality 


ci/K, ... ,x*^,ao + Xq, ... ,ao + x*Q,x*a + l,... ,Xq + l) 

Cs/('S, Xg + 1, . . . , + 1) 


mi 


+ E 

s<Z{xl,...,x^,ao+x’^,...,(To+x^} 


> Ci(l - p)^^f{xl,.. .,x*^,ao + XQ,.. .,ao + x*Q, Xq, ...,x*q) 
+ 


mi 


^ Cs{l-p)^ I®I/(s,Xg,...,Xg), 

sC{xJ,.. aQ+XQ,...,ao+XQ} 


( 21 ) 


for some positive constants ci, c^. Observe that 


7(xi,..., xj;^, (To + Xg, ..., cro + Xg, Xg + 1,..., Xg + 1) 

'-V-' 

mi 


70-1 

70 


and that the induction hypothesis is satisfied in the first case. In the second case, 


if do = 1 
otherwise, 


Cj(x^,..., x;;,, do + x^,..., do + x^, x^ + 1,..., x^ + 1) = do - 1, 

'-V-' 

mi 

and hence the induction hypothesis is satisfied in the second case as well. Thus


/(xi,..., x;;,, do + Xg, ..., do + Xg, Xg + 1,..., Xg + 1) 

'-V-' 

mi 


( 22 ) 


Now consider any s C {x \,..., xj^, do + xL,..., do + xL}, and recall our nofafion of mi(s) := ^ - l{si = 

'-V-' 

mi 

do + Xg}. If do = 1 or if fhi(s) = 0 fhen 7 ((s,Xg + 1,..., Xg + 1)) < 70 — 1; if d > 1 and 
mi(s) > 0 fhen 7((s, Xg + 1, • • •, Xg + 1)) < 7o and d(s, Xg + 1, • • •, Xg + 1) < do — 1. If mi(s) = 0 
fhen 7((s,x^,... ,x^)) < 70 - 1, ofherwise 7((s,x^,..., x^)) < 70, d((s,x^,...,x^)) = do and 
/3((s, Xq, ..., Xg)) < /)o — 1- These terms thus satisfy our induction hypothesis and hence 

/(s, Xg + 1, • ■ ■, Xg + 1) = (1 — p)^ Xg, • • ■, Xg). (23) 


Substituting and pS]) in (21 1 gives us our desired matching lower bound 


f{xl, ...,x*^,ao + Xc,...,ao + Xq, x*g, .. 

'-V-' 

mi 




This completes the proof for $\{x \mid x_i > 0 \ \forall\, i \in [G]\}$.

We will now show that $f(x) = 0$ for all $\{x \mid \min_{i \in [G]} x_i < 0\}$. The arguments above for the case
$\{x \mid \min_{i \in [G]} x_i > 0\}$ imply that for any incentive-compatible function $f$, the first part of Lemma 4.3
must be satisfied with equality. This allows us to employ the second part of Lemma 4.3. For $i \in [G]$, let
$y_i = y'_i = x_i$ if $x_i > 0$, and $y_i - 1 = y'_i = |x_i|$ otherwise; set $y_i = y'_i = B$ for all $i \in \{G + 1, \ldots, N\}$.
Then the second part of Lemma 4.3 necessitates $f(x_1, \ldots, x_G) = 0$, thus completing the proof.

A.4 Proof of Theorem 5.1: Mechanism in absence of coarse belief assumption

Without loss of generality, assume that $\alpha_{\min} = 0$ since in our setting, the property of incentive compatibility
is invariant to any constant shift and positive scaling of the payment. We adopt the succinct notation

$$\alpha := \alpha_{\max} - \alpha_{\min}.$$

First consider the case of $N = G = 1$. Mechanism 1 reduces to $f(x) = \alpha (1 - \rho)^{(x - 1)} \mathbf{1}\{x > 0\}$.

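Suppose without loss of generality that the worker's beliefs for the B options are $p_1 \ge \cdots \ge p_B$ and
suppose $m$ is the largest $\ell \in [B]$ such that $\frac{p_\ell}{\sum_{i=1}^{\ell} p_i} > \rho$. A mechanism that is incentive compatible will strictly maximize
the worker's expected payment when she selects the options $\{1, \ldots, m\}$.

Suppose a worker decides to select some $\ell$ of the B options, say options $\{o_1, \ldots, o_\ell\} \subseteq [B]$. Then it is
easy to see that her expected payment,

$$\alpha (1 - \rho)^{\ell - 1} \sum_{i=1}^{\ell} p_{o_i},$$

is maximized when she selects options $\{1, \ldots, \ell\}$, i.e., the $\ell$ options that are most likely to be correct. It
remains to show that among all choices of $\ell \in [B]$, the expected payment is maximized when the worker
selects $\ell = m$. Let $\Phi_\ell$ denote the expected payment when the worker selects $\ell$ options:

$$\Phi_\ell = \alpha (1 - \rho)^{\ell - 1} \sum_{i=1}^{\ell} p_i.$$

Hence for any $\ell \in \{2, \ldots, B\}$, we have

$$\frac{\Phi_\ell}{\Phi_{\ell - 1}} = \frac{\alpha \sum_{i=1}^{\ell} p_i (1 - \rho)^{\ell - 1}}{\alpha \sum_{i=1}^{\ell - 1} p_i (1 - \rho)^{\ell - 2}} = \frac{1 - \rho}{1 - \frac{p_\ell}{\sum_{i=1}^{\ell} p_i}}.$$

We know that $\frac{p_\ell}{\sum_{i=1}^{\ell} p_i} < \rho$ whenever $\ell > m$, and $\frac{p_\ell}{\sum_{i=1}^{\ell} p_i} > \rho$ when $\ell = m$. Furthermore, since $p_\ell$ decreases
with $\ell$ and $\sum_{i=1}^{\ell} p_i$ increases with $\ell$, it must also be that $\frac{p_\ell}{\sum_{i=1}^{\ell} p_i} > \rho$ for all $\ell < m$. Thus we have
$\frac{\Phi_\ell}{\Phi_{\ell - 1}} > 1$ for all $\ell \le m$ and $\frac{\Phi_\ell}{\Phi_{\ell - 1}} < 1$ for all $\ell > m$, or in other words,

$$\Phi_1 < \cdots < \Phi_{m-1} < \Phi_m > \Phi_{m+1} > \Phi_{m+2} > \cdots.$$

It follows that the worker will be incentivized to choose $\ell = m$.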
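The single-question argument above can be checked numerically. The sketch below assumes the payment form $f(x) = \alpha(1-\rho)^{(x-1)}\mathbf{1}\{x > 0\}$ for $N = G = 1$ and compares a brute-force maximization over the number of selected options with the index $m$ given by the ratio rule. The function names are hypothetical.

```python
def expected_payment(num_selected, beliefs, rho, alpha=1.0):
    """Expected payment when the worker selects her `num_selected` most likely options."""
    top = sorted(beliefs, reverse=True)[:num_selected]
    return alpha * (1 - rho) ** (num_selected - 1) * sum(top)


def target_size(beliefs, rho):
    """Largest l with p_l / (p_1 + ... + p_l) > rho, beliefs taken in decreasing order."""
    p = sorted(beliefs, reverse=True)
    running, m = 0.0, 0
    for l, pl in enumerate(p, start=1):
        running += pl
        if running > 0 and pl / running > rho:
            m = l
    return m
```

For beliefs (0.5, 0.3, 0.15, 0.05) and rho = 0.2, the expected payments for selecting 1 to 4 options are 0.5, 0.64, 0.608 and 0.512 (with alpha = 1), so the maximum is at two options, in agreement with the ratio rule.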

Let us now consider the case of $N = G > 1$. By our assumption of the independence of the beliefs of the worker across the questions, the expected payment equals

\[ \alpha \prod_{i=1}^{G} \mathbb{E}\left[ (1-\rho)^{x_i - 1} \mathbf{1}\{x_i > 0\} \right], \]

where the expectation in the $i$-th factor is taken with respect to the worker's belief for question $i$. Since the payments are non-negative, if each individual component in the product is maximized then the product is also necessarily maximized. Each individual component simply corresponds to the setting of $N = G = 1$ discussed earlier. Thus calling upon our earlier result, we get that the expected payment for the case $N = G > 1$ is maximized when the worker acts as desired for every question.
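
The product argument can be illustrated by brute force. This is a sketch under stated assumptions rather than part of the proof: the three belief vectors and the value of $\rho$ are hypothetical, and the worker is restricted to selecting the most likely options on each question. It maximizes each factor separately, then maximizes the product over all combinations of set sizes, and checks that the two answers agree.

```python
from itertools import product

def question_payment(beliefs, ell, rho):
    """Per-question factor (1 - rho)**(ell - 1) * (sum of the ell largest beliefs)."""
    top = sorted(beliefs, reverse=True)[:ell]
    return (1.0 - rho) ** (ell - 1) * sum(top)

def joint_payment(all_beliefs, sizes, rho, alpha=1.0):
    """Expected payment for N = G questions with independent beliefs."""
    value = alpha
    for beliefs, ell in zip(all_beliefs, sizes):
        value *= question_payment(beliefs, ell, rho)
    return value

rho = 0.2   # hypothetical value of rho
all_beliefs = [                      # hypothetical beliefs, one row per question
    [0.5, 0.3, 0.15, 0.05],
    [0.9, 0.05, 0.03, 0.02],
    [0.4, 0.35, 0.2, 0.05],
]

# Maximize each factor on its own.
per_question = tuple(
    max(range(1, len(b) + 1), key=lambda ell, b=b: question_payment(b, ell, rho))
    for b in all_beliefs
)
# Maximize the whole product over every combination of set sizes.
joint = max(
    product(*(range(1, len(b) + 1) for b in all_beliefs)),
    key=lambda sizes: joint_payment(all_beliefs, sizes, rho),
)
assert joint == per_question
```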

Let us finally consider the general case of $N \geq G \geq 1$. Recall that the expected payment for the general case is a cascade of two expectations: the outer expectation is with respect to the uniformly random distribution of the $G$ gold standard questions among the $N$ total questions, while the inner expectation is taken over the worker's beliefs of the different questions conditioned on the choice of the gold standard questions and restricts attention to only these $G$ questions. The arguments above for the case $N = G$ prove that every individual term in the inner expectation is maximized when the worker acts as desired. The outer expectation averages these terms with non-negative weights that do not depend on the worker's answers, and hence does not affect this argument. The expected payment is thus maximized when the worker acts as desired.


A.5 Proof of Theorem [5^ Uniqueness 


Without loss of generality, assume that $\alpha_{\min} = 0$, since in our setting the property of incentive compatibility is invariant to any constant shift and positive scaling of the payment. We adopt the succinct notation $\alpha := \alpha_{\max} - \alpha_{\min}$. The proof of this theorem employs some of the tools developed in Shah and Zhou (2015). We begin with a lemma deriving a condition that must necessarily be satisfied by any incentive-compatible mechanism. Note that we do not make the coarse belief assumption here; instead, we suppose that workers can have arbitrary beliefs.

Lemma A.1. Any incentive-compatible mechanism must satisfy

\[ f(x_1, \ldots, x_{i-1}, x_i + 1, x_{i+1}, \ldots, x_G) = (1-\rho) f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_G) + \rho f(x_1, \ldots, x_{i-1}, -x_i, x_{i+1}, \ldots, x_G), \]

for every $i \in [G]$ and $(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_G) \in \{-(B-1), \ldots, -1, 1, \ldots, B\}^{G-1}$, $x_i \in [B-1]$.

Note that the lemma does not use the no-free-lunch condition. The proof of the lemma is provided at the end of this section. Using this lemma, we now complete the proof of the theorem.

Consider any incentive-compatible mechanism $f$ that satisfies the no-free-lunch condition. We first show that the mechanism must necessarily make a zero payment when one or more questions in the gold standard are attempted incorrectly. To this end, observe that since $f \geq 0$ and $\rho \in (0,1)$, the statement of Lemma A.1 necessitates that for every $i \in [G]$ and $(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_G) \in \{-(B-1), \ldots, -1, 1, \ldots, B\}^{G-1}$, $x_i \in [B-1]$:

If $f(x_1, \ldots, x_{i-1}, x_i + 1, x_{i+1}, \ldots, x_G) = 0$

then $f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_G) = f(x_1, \ldots, x_{i-1}, -x_i, x_{i+1}, \ldots, x_G) = 0$.


A repeated application of this argument implies:

If $f(x_1, \ldots, x_{i-1}, B, x_{i+1}, \ldots, x_G) = 0$ then $f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_G) = 0$ for all $x_i \in \{-(B-1), \ldots, -1, 1, \ldots, B-1\}$.

Now consider any evaluation $(x_1, \ldots, x_G)$ which has at least one incorrect answer. Suppose without loss of generality that the first question is the one answered incorrectly, i.e., $x_1 \leq -1$. The no-free-lunch condition then makes $f(x_1, B, \ldots, B) = 0$. Applying our arguments from above we get that $f(x_1, x_2, \ldots, x_G) = 0$ for every value of $(x_2, \ldots, x_G) \in \{-(B-1), \ldots, -1, 1, \ldots, B\}^{G-1}$.


Substituting this necessary condition in Lemma A.1, we get that for every question $i \in \{1, \ldots, G\}$ and every $(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_G) \in [B]^{G-1}$, $x_i \in [B-1]$,

\[ f(x_1, \ldots, x_{i-1}, x_i + 1, x_{i+1}, \ldots, x_G) = (1-\rho) f(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_G). \]

Substituting $f(1, \ldots, 1) = \alpha$, we get the desired answer.
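
As a sanity check on the conclusion, one can verify exhaustively that the payment $\alpha \prod_{i} (1-\rho)^{x_i - 1}$ on all-positive evaluations, and zero otherwise, satisfies the identity of Lemma A.1. The sketch below is illustrative only: the values of $B$, $G$ and $\rho$ and the function names are hypothetical, and the check covers a small instance rather than the general statement.

```python
from itertools import product

def payment(xs, rho, alpha=1.0):
    """alpha * prod_i (1 - rho)**(x_i - 1) if every x_i >= 1, and 0 otherwise."""
    if any(x < 1 for x in xs):
        return 0.0
    value = alpha
    for x in xs:
        value *= (1.0 - rho) ** (x - 1)
    return value

def lemma_residual(xs, i, rho):
    """f(.., x_i + 1, ..) - (1 - rho) f(.., x_i, ..) - rho f(.., -x_i, ..)."""
    up = xs[:i] + (xs[i] + 1,) + xs[i + 1:]
    neg = xs[:i] + (-xs[i],) + xs[i + 1:]
    return payment(up, rho) - (1 - rho) * payment(xs, rho) - rho * payment(neg, rho)

B, G, rho = 3, 2, 0.25                                   # hypothetical small instance
values = [v for v in range(-(B - 1), B + 1) if v != 0]   # {-(B-1),...,-1,1,...,B}
worst = 0.0
for xs in product(values, repeat=G):
    for i in range(G):
        if 1 <= xs[i] <= B - 1:                          # the lemma needs x_i in [B - 1]
            worst = max(worst, abs(lemma_residual(xs, i, rho)))
assert worst < 1e-12
```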


We now return to complete the proof of Lemma A.1.

Proof of Lemma A.1. First consider the case of $G = N$. Consider some $\eta, \gamma \in \{0, \ldots, G-1\}$ with $\eta + \gamma < G$. Suppose $i = \eta + \gamma + 1$, $x_1, \ldots, x_\eta \in [B-1]$, $x_{\eta+1}, \ldots, x_{\eta+\gamma} \in -[B-1]$ and $x_{\eta+\gamma+2}, \ldots, x_N = B$.

For every question $j \in [\eta + \gamma]$, suppose the worker's belief is $\delta_j \in (0, \rho)$ for the last option and $\frac{1-\delta_j}{|x_j|}$ each for the first $|x_j|$ options. One can verify that since $\delta_j < \rho < \frac{1}{B}$ and $|x_j| \leq B - 1$, it must be that $\frac{1-\delta_j}{|x_j|} > \delta_j$, and that incentive-compatibility requires incentivizing the worker to select the first $|x_j|$ options. Suppose the worker does so. Now for every question $j \in \{\eta + \gamma + 2, \ldots, N\}$, suppose the belief of the worker is uniform across all $B$ options. The worker should be incentivized to select all $B$ options in this case; suppose the worker does so. Finally, for question $i$, suppose the worker's belief is $\delta \in (0, \frac{1}{B})$ for the last option and $\frac{1-\delta}{|x_i|}$ each for the first $|x_i|$ options. Then the worker must be incentivized to select the first $|x_i|$ options alone if $\delta < \rho$, and select the last option along with the first $|x_i|$ options if $\delta > \rho$.

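Define $\{r_j\}_{j \in [\eta+\gamma]}$ as $r_j = \delta_j$ for $j \in [\eta]$, and $r_j = 1 - \delta_j$ for $j \in \{\eta+1, \ldots, \eta+\gamma\}$. Let $\epsilon := \{\epsilon_1, \ldots, \epsilon_{\eta+\gamma}\} \in \{-1, 1\}^{\eta+\gamma}$. Incentive-compatibility for question $i$ necessitates

\[
\begin{aligned}
&(1-\delta) \sum_{\epsilon \in \{-1,1\}^{\eta+\gamma}} f(\epsilon_1 x_1, \ldots, \epsilon_\eta x_\eta, \epsilon_{\eta+1} x_{\eta+1}, \ldots, \epsilon_{\eta+\gamma} x_{\eta+\gamma}, x_i, B, \ldots, B) \prod_{j \in [\eta+\gamma]} r_j^{\frac{1-\epsilon_j}{2}} (1-r_j)^{\frac{1+\epsilon_j}{2}} \\
&\quad + \delta \sum_{\epsilon \in \{-1,1\}^{\eta+\gamma}} f(\epsilon_1 x_1, \ldots, \epsilon_\eta x_\eta, \epsilon_{\eta+1} x_{\eta+1}, \ldots, \epsilon_{\eta+\gamma} x_{\eta+\gamma}, -x_i, B, \ldots, B) \prod_{j \in [\eta+\gamma]} r_j^{\frac{1-\epsilon_j}{2}} (1-r_j)^{\frac{1+\epsilon_j}{2}} \\
&\underset{\delta < \rho}{\overset{\delta > \rho}{\lessgtr}} \sum_{\epsilon \in \{-1,1\}^{\eta+\gamma}} f(\epsilon_1 x_1, \ldots, \epsilon_\eta x_\eta, \epsilon_{\eta+1} x_{\eta+1}, \ldots, \epsilon_{\eta+\gamma} x_{\eta+\gamma}, x_i + 1, B, \ldots, B) \prod_{j \in [\eta+\gamma]} r_j^{\frac{1-\epsilon_j}{2}} (1-r_j)^{\frac{1+\epsilon_j}{2}}.
\end{aligned}
\]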

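The left hand side of this expression is the expected payment if the worker chooses the first $|x_i|$ options for question $(\eta + \gamma + 1)$, while the right hand side is the expected payment if she chooses the first $|x_i|$ options as well as the last option. For any real-valued variable $q$, and for any real-valued constants $a$, $b$ and $c$,

\[ aq \underset{q > c}{\overset{q < c}{\lessgtr}} b \;\Longrightarrow\; ac = b. \]

With $q = 1 - \delta$ in this argument, we get

\[
\begin{aligned}
&(1-\rho) \sum_{\epsilon \in \{-1,1\}^{\eta+\gamma}} f(\epsilon_1 x_1, \ldots, \epsilon_\eta x_\eta, \epsilon_{\eta+1} x_{\eta+1}, \ldots, \epsilon_{\eta+\gamma} x_{\eta+\gamma}, x_i, B, \ldots, B) \prod_{j \in [\eta+\gamma]} r_j^{\frac{1-\epsilon_j}{2}} (1-r_j)^{\frac{1+\epsilon_j}{2}} \\
&\quad + \rho \sum_{\epsilon \in \{-1,1\}^{\eta+\gamma}} f(\epsilon_1 x_1, \ldots, \epsilon_\eta x_\eta, \epsilon_{\eta+1} x_{\eta+1}, \ldots, \epsilon_{\eta+\gamma} x_{\eta+\gamma}, -x_i, B, \ldots, B) \prod_{j \in [\eta+\gamma]} r_j^{\frac{1-\epsilon_j}{2}} (1-r_j)^{\frac{1+\epsilon_j}{2}} \\
&\quad - \sum_{\epsilon \in \{-1,1\}^{\eta+\gamma}} f(\epsilon_1 x_1, \ldots, \epsilon_\eta x_\eta, \epsilon_{\eta+1} x_{\eta+1}, \ldots, \epsilon_{\eta+\gamma} x_{\eta+\gamma}, x_i + 1, B, \ldots, B) \prod_{j \in [\eta+\gamma]} r_j^{\frac{1-\epsilon_j}{2}} (1-r_j)^{\frac{1+\epsilon_j}{2}} = 0.
\end{aligned}
\tag{24}
\]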


The left hand side of (24) represents a polynomial in the $(\eta + \gamma)$ variables $\{r_j\}_{j \in [\eta+\gamma]}$ which evaluates to zero for all values of the variables within an $(\eta + \gamma)$-dimensional solid ball. Thus, the coefficients of the monomials in this polynomial must be zero. In particular, the constant term must be zero. The constant term appears when $\epsilon_j = 1 \ \forall\, j$ in the summations in (24). Setting the constant term to zero gives

\[ (1-\rho) f(x_1, \ldots, x_{\eta+\gamma}, x_{\eta+\gamma+1}, B, \ldots, B) + \rho f(x_1, \ldots, x_{\eta+\gamma}, -x_{\eta+\gamma+1}, B, \ldots, B) - f(x_1, \ldots, x_{\eta+\gamma}, x_{\eta+\gamma+1} + 1, B, \ldots, B) = 0 \]

as desired. Since the arguments above hold for any permutation of the $N$ questions, this completes the proof for the case of $G = N$.

Now consider the case $G < N$. Let $g : \{-(B-1), \ldots, -1, 1, \ldots, B\}^N \to \mathbb{R}_+$ represent the expected payment given an evaluation of all the $N$ answers, when the identities of the gold standard questions are unknown. Here, the expectation is with respect to the (uniformly random) choice of the $G$ gold standard questions. If $(x_1, \ldots, x_N) \in \{-(B-1), \ldots, -1, 1, \ldots, B\}^N$ are the evaluations of the worker's answers to the $N$ questions then the expected payment is

\[ g(x_1, \ldots, x_N) = \frac{1}{\binom{N}{G}} \sum_{(i_1, \ldots, i_G) \subseteq \{1, \ldots, N\}} f(x_{i_1}, \ldots, x_{i_G}). \tag{25} \]


Applying the same arguments to $g$ as done to $f$ above gives

\[ (1-\rho) g(x_1, \ldots, x_{\eta+\gamma}, x_{\eta+\gamma+1}, B, \ldots, B) + \rho g(x_1, \ldots, x_{\eta+\gamma}, -x_{\eta+\gamma+1}, B, \ldots, B) - g(x_1, \ldots, x_{\eta+\gamma}, x_{\eta+\gamma+1} + 1, B, \ldots, B) = 0. \tag{26} \]

The proof now proceeds via an induction on the quantity $(G - \eta - \gamma - 1)$. We begin with the case of $(G - \eta - \gamma - 1) = G - 1$, which implies $\eta = \gamma = 0$. In this case (26) simplifies to

\[ (1-\rho) g(x_1, B, \ldots, B) + \rho g(-x_1, B, \ldots, B) = g(x_1 + 1, B, \ldots, B). \]

Applying the expansion of function $g$ in terms of function $f$ from (25) for some $x_1 \in [B-1]$ gives

\[ (1-\rho) \big( c_1 f(x_1, B, \ldots, B) + c_2 f(B, B, \ldots, B) \big) + \rho \big( c_1 f(-x_1, B, \ldots, B) + c_2 f(B, B, \ldots, B) \big) = c_1 f(x_1 + 1, B, \ldots, B) + c_2 f(B, B, \ldots, B) \]

for constants $c_1 > 0$ and $c_2 > 0$ that respectively represent the probabilities that the first question is picked and not picked in the set of $G$ gold standard questions. Cancelling out the common terms on both sides of the equation and dividing by $c_1$, we get the desired result

\[ (1-\rho) f(x_1, B, \ldots, B) + \rho f(-x_1, B, \ldots, B) = f(x_1 + 1, B, \ldots, B). \]

Next, we consider the case when $(G - \eta - \gamma - 1)$ questions are skipped in the gold standard, and assume that the result is true when more than $(G - \eta - \gamma - 1)$ questions are skipped in the gold standard. In (26), the functions $g$ decompose into a sum of the constituent $f$ functions. These constituent functions $f$ are of two types: the first where all of the first $(\eta + \gamma + 1)$ questions are included in the gold standard, and the second where one or more of the first $(\eta + \gamma + 1)$ questions are not included in the gold standard. The second case corresponds to situations where there are more than $(G - \eta - \gamma - 1)$ questions skipped in the gold standard and hence satisfies our induction hypothesis. The terms corresponding to these functions thus cancel out in the expansion of (26). The remainder comprises only evaluations of function $f$ for arguments in which the first $(\eta + \gamma + 1)$ questions are included in the gold standard. Since the last $(N - \eta - \gamma - 1)$ questions are skipped by the worker, the remainder evaluates to

\[ (1-\rho) c_3 f(x_1, \ldots, x_{\eta+\gamma}, x_i, B, \ldots, B) + \rho c_3 f(x_1, \ldots, x_{\eta+\gamma}, -x_i, B, \ldots, B) = c_3 f(x_1, \ldots, x_{\eta+\gamma}, x_i + 1, B, \ldots, B) \tag{27} \]

for some constant $c_3 > 0$. Dividing throughout by $c_3$ gives the desired result.

Finally, the arguments above hold for any permutation of the first $G$ questions, thus completing the proof. □

A.6 Proof of Theorem 8.1: Mechanism under alternative formulation

Without loss of generality, assume that $a = 0$ and $b = 1$, since in our setting the property of incentive compatibility is invariant to any constant shift and positive scaling of the payment.

First consider the case of $N = G = 1$. Suppose without loss of generality that the worker's beliefs for the $B$ options are $p_1, \ldots, p_B$. It is easy to verify that the expected payment when the worker selects the options $\{o_1, \ldots, o_m\}$, for some $m$, equals


B 

i=l 

It follows that the payment is strictly maximized when the worker selects all options whose beliefs are greater than $\sigma$, given the assumption that none of the beliefs exactly equals $\sigma$.

The arguments above complete the proof for the case $N = G = 1$. The extension to $N \geq G \geq 1$ follows in a manner identical to the analogous extension in the proof of Theorem 4.1.

The proof of Corollary 8.2 follows in an identical fashion.

A.7 Proof of Theorem 8.3: Negative results under alternative formulation

We present the results of uniqueness and impossibility respectively. We will let $f$ denote any incentive-compatible mechanism.

A.7.1 Part A: Uniqueness 

Consider any $m \in \{1, \ldots, s_{\max} - 1\}$. Consider the set of beliefs $p_1 = \sigma + \delta$, $p_2 = \cdots = p_{m+1} = \frac{1 - \sigma - \delta}{m}$ and $p_{m+2} = \cdots = p_B = 0$, for some value of $\delta$ in the neighborhood of 0. For the values of $m$ under consideration, one can verify that $\sigma < \frac{1-\sigma}{m} < 1$. Consequently, there exists some value $\delta_{\max} > 0$ such that for every $\delta \in [-\delta_{\max}, \delta_{\max}]$ we have $0 < \sigma + \delta < 1$ and $\sigma < \frac{1 - \sigma - \delta}{m} < 1$. In order to achieve the stated goal, we would thus require to incentivize the worker to select options 1 through $(m+1)$ if $\delta > 0$, and select options 2 through $(m+1)$ if $\delta < 0$. The mechanism $f$ therefore must satisfy the pair of inequalities

\[ f(m+1) \underset{\delta > 0}{\overset{\delta < 0}{\lessgtr}} (1 - \sigma - \delta) f(m) + (\sigma + \delta) f(-m). \]

Since the right hand side of the expression above is linear in $\delta$ but the left hand side is a constant, we must have

\[ f(m+1) = (1-\sigma) f(m) + \sigma f(-m) \quad \text{for all } m \in \{1, \ldots, s_{\max} - 1\}. \tag{28} \]

We will return to this equation later.

Next consider any $m \in \{1, \ldots, s_{\max} - 2\}$. Consider the set of beliefs $p_1 = \sigma + \delta$, $p_2 = \sigma + \delta$, $p_3 = \cdots = p_{m+2} = \frac{1 - 2\sigma - 2\delta}{m}$ and $p_{m+3} = \cdots = p_B = 0$, for some value of $\delta$ in the neighborhood of 0. For the values of $m$ under consideration, one can verify that $\sigma < \frac{1 - 2\sigma}{m} < 1$. Consequently, there exists some value $\delta_{\max} > 0$ such that for every $\delta \in [-\delta_{\max}, \delta_{\max}]$ we have $0 < \sigma + \delta < 1$ and $\sigma < \frac{1 - 2\sigma - 2\delta}{m} < 1$. In order to achieve the stated goal, we would thus require to incentivize the worker to select options 1 through $(m+2)$ if $\delta > 0$, and select options 3 through $(m+2)$ if $\delta < 0$. The mechanism $f$ thus must satisfy

\[ f(m+2) \underset{\delta > 0}{\overset{\delta < 0}{\lessgtr}} (1 - 2\sigma - 2\delta) f(m) + (2\sigma + 2\delta) f(-m). \]

Since the right hand side of the expression above is linear in $\delta$ but the left hand side is a constant, we must have

\[ f(m+2) = (1 - 2\sigma) f(m) + 2\sigma f(-m) \quad \text{for all } m \in \{1, \ldots, s_{\max} - 2\}. \tag{29} \]

It follows from (28) and (29) that the values of $f(m)$ for every $m \in \{-(s_{\max} - 1), \ldots, -1, 1, \ldots, s_{\max} - 2\}$ can be expressed in terms of a linear combination of $f(s_{\max})$ and $f(s_{\max} - 1)$. We will now prove that the same holds true for $f(-s_{\max})$ and $f(0)$ as well, whenever these quantities are defined.
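
The back-substitution implied by (28) and (29) can be made explicit. Subtracting twice (28) from (29) eliminates $f(-m)$ and gives $f(m) = 2 f(m+1) - f(m+2)$, after which (28) yields $f(-m)$. The sketch below is an illustration under stated assumptions: the function name, the value of $\sigma$, the value of $s_{\max}$ and the two starting values are hypothetical, and $s_{\max}$ is simply taken as an input.

```python
def solve_from_top(f_top, f_below, s_max, sigma):
    """Return {x: f(x)} for x in {-(s_max-1),...,-1,1,...,s_max},
    given f(s_max) = f_top and f(s_max - 1) = f_below."""
    f = {s_max: f_top, s_max - 1: f_below}
    # (29) minus twice (28) eliminates f(-m): f(m) = 2 f(m+1) - f(m+2).
    for m in range(s_max - 2, 0, -1):
        f[m] = 2 * f[m + 1] - f[m + 2]
    # (28) solved for f(-m): f(-m) = (f(m+1) - (1 - sigma) f(m)) / sigma.
    for m in range(1, s_max):
        f[-m] = (f[m + 1] - (1 - sigma) * f[m]) / sigma
    return f

sigma, s_max = 0.2, 4     # hypothetical values
f = solve_from_top(f_top=0.2, f_below=0.4, s_max=s_max, sigma=sigma)

# Check (28) for m in {1,...,s_max-1} and (29) for m in {1,...,s_max-2}.
for m in range(1, s_max):
    assert abs(f[m + 1] - ((1 - sigma) * f[m] + sigma * f[-m])) < 1e-9
for m in range(1, s_max - 1):
    assert abs(f[m + 2] - ((1 - 2 * sigma) * f[m] + 2 * sigma * f[-m])) < 1e-9
```

Every value in the returned dictionary is a linear function of the two inputs `f_top` and `f_below`, which is the sense in which only two degrees of freedom remain.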

The quantity $f(-s_{\max})$ is defined only when $s_{\max} < B$. The reason is that when $s_{\max} = B$, $f(-s_{\max}) = f(-B)$ corresponds to a scenario where all the options are selected and the correct option is not, which is impossible. Now consider the set of beliefs $p_1 = \sigma + \delta$, $p_2 = \cdots = p_{s_{\max}} = \frac{1 - \sigma - \delta - \epsilon}{s_{\max} - 1}$, $p_{s_{\max}+1} = \epsilon$ and $p_{s_{\max}+2} = \cdots = p_B = 0$, for some values of $\epsilon > 0$ and $\delta$ in the neighborhood of 0. From the definition of $s_{\max}$, one can easily verify that $\sigma < \frac{1-\sigma}{s_{\max} - 1} < 1$ whenever $s_{\max} > 1$. Consequently, there exist some values $\delta_{\max} > 0$ and $\epsilon_{\max} \in (0, \sigma)$ such that for every $\delta \in [-\delta_{\max}, \delta_{\max}]$ and for every $\epsilon \in [0, \epsilon_{\max}]$, we have $0 < \sigma + \delta < 1$ and, when $s_{\max} > 1$, we also have $\sigma < \frac{1 - \sigma - \delta - \epsilon}{s_{\max} - 1} < 1$. In order to achieve the stated goal, we would thus require to incentivize the worker to select options 1 through $s_{\max}$ if $\delta > 0$, and select options 2 through $s_{\max}$ if $\delta < 0$. The mechanism $f$ therefore must satisfy

\[ (1-\epsilon) f(s_{\max}) + \epsilon f(-s_{\max}) \underset{\delta > 0}{\overset{\delta < 0}{\lessgtr}} (1 - \sigma - \delta - \epsilon) f(s_{\max} - 1) + (\sigma + \delta + \epsilon) f(-(s_{\max} - 1)). \]

Since the right hand side of the expression above is linear in $\delta$ but the left hand side does not depend on $\delta$, we must have

\[ (1-\epsilon) f(s_{\max}) + \epsilon f(-s_{\max}) = (1 - \sigma - \epsilon) f(s_{\max} - 1) + (\sigma + \epsilon) f(-(s_{\max} - 1)). \]

Since this equation must be true for every $\epsilon \in [0, \epsilon_{\max}]$, the coefficients of $\epsilon$ on the two sides must agree, and hence we must have

\[ f(s_{\max}) - f(-s_{\max}) = f(s_{\max} - 1) - f(-(s_{\max} - 1)). \]

Thus the term $f(-s_{\max})$, whenever applicable, can also be written as a linear combination of $f(s_{\max})$ and $f(s_{\max} - 1)$.

The quantity $f(0)$ is defined only when $\sigma > \frac{1}{B}$. The reason is that when $\sigma \leq \frac{1}{B}$, it is mathematically impossible for the beliefs for all the $B$ options to be less than or equal to $\sigma$ (recall our assumption that no belief equals exactly $\sigma$). Now consider the set of beliefs $p_1 = \sigma + \delta$, $p_2 = \cdots = p_B = \frac{1 - \sigma - \delta}{B - 1}$ for some value of $\delta$ in the neighborhood of 0. One can verify that in this case of $\sigma > \frac{1}{B}$, it must be that $0 < \frac{1 - \sigma}{B - 1} < \sigma$. Consequently, there exists some value $\delta_{\max} > 0$ such that for every $\delta \in [-\delta_{\max}, \delta_{\max}]$, we have $0 < \sigma + \delta < 1$ and $0 < \frac{1 - \sigma - \delta}{B - 1} < \sigma$. In order to achieve the stated goal, we would thus require to incentivize the worker to select option 1 if $\delta > 0$, and select no options if $\delta < 0$. The mechanism $f$ therefore must satisfy

\[ (\sigma + \delta) f(1) + (1 - \sigma - \delta) f(-1) \underset{\delta > 0}{\overset{\delta < 0}{\lessgtr}} f(0). \]

Since the left hand side of the expression above is linear in $\delta$ but the right hand side is a constant, we must have

\[ \sigma f(1) + (1 - \sigma) f(-1) = f(0). \]

Thus the term $f(0)$, whenever applicable, can also be written as a linear combination of $f(s_{\max})$ and $f(s_{\max} - 1)$.

From the arguments above, we get that the design of $f$ has only two degrees of freedom. Given that our claim is only up to some shift and scale, the claim is proved.

A.7.2 Part B: Impossibility


Let us first prove the result for the case of $N = G = 1$. The result of part A of Theorem 8.3 implies that if there exists an incentive compatible mechanism for this setting, then the mechanism must be the mechanism analyzed in Section A.6, up to a constant shift and positive scale. Consider a worker with the belief $p_1 = 1 - \sigma$, $p_2 = \sigma$ and $p_3 = \cdots = p_B = 0$. Since $\sigma < \frac{1}{2}$, under an incentive compatible mechanism, the expected payment must be strictly larger if the worker selects only option 1 as compared to the expected payment when the worker selects options 1 and 2. However, one can compute that under this mechanism the expected payment in the two cases is identical. It follows that under any possible incentive-compatible mechanism, the expected payment must be identical for the following two actions of the worker: (a) selecting only option 1, and (b) selecting options 1 and 2. It follows that no mechanism is incentive compatible.
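
The equality of the two expected payments is an instance of (28) with $m = 1$: selecting only option 1 yields $(1-\sigma) f(1) + \sigma f(-1)$, while selecting options 1 and 2 yields $f(2)$. The sketch below checks this numerically. The payment rule `example_f` is an assumption introduced purely for illustration (it is one rule that satisfies (28), with an arbitrary shift and scale), and the value of $\sigma$ is hypothetical.

```python
def expected_payment_only_first(f, sigma):
    """Worker selects option 1 only: the answer is correct with probability 1 - sigma."""
    return (1 - sigma) * f(1) + sigma * f(-1)

def expected_payment_first_two(f):
    """Worker selects options 1 and 2: the correct option is always included."""
    return f(2)

sigma = 0.3   # hypothetical threshold

def example_f(x, shift=0.7, scale=2.5):
    # Hypothetical payment rule, chosen only because it satisfies (28);
    # the shift and scale are arbitrary.
    return shift + scale * ((1.0 if x > 0 else 0.0) - sigma * abs(x))

gap = expected_payment_only_first(example_f, sigma) - expected_payment_first_two(example_f)
assert abs(gap) < 1e-12
```

Because the gap vanishes for any shift and any positive scale, no member of this family can make the worker strictly prefer action (a) over action (b).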

We now move on to the general case of $N \geq G \geq 1$. Consider a worker who knows the answers to questions 2 through $N$ with a belief of 1 in each case. Suppose that for each of these $(N - 1)$ questions, this worker selects the respective options that she thinks are correct. We are now left with the first question. Letting $X$ denote a random variable representing the evaluation of the worker's response to the first question, the expected payment from the worker's point of view is

\[ \frac{G}{N} \, \mathbb{E}\left[ f(X, 1, \ldots, 1) \right] + \left( 1 - \frac{G}{N} \right) f(1, \ldots, 1). \]

The expectation in the first term is taken with respect to the randomness in $X$. Defining

\[ \tilde{f}(X) := \frac{G}{N} f(X, 1, \ldots, 1) + \left( 1 - \frac{G}{N} \right) f(1, \ldots, 1), \]

and applying the same arguments to $\tilde{f}$ as those for $f$ for the case of $N = G = 1$ above gives the desired contradiction. This completes the proof.